Things I’ve used AI for

AI models (LLMs) are good at some things and bad at others, and nobody seems to agree on which are which. This is my attempt at describing my personal experience with these tools, for better or worse. I don’t stick to any particular model; most of the time I’m actually using at least two to cross-check answers. I also haven’t paid for any AI use (yet). I know that means I’m not using the best out there, but until I run into something I’m not satisfied with, I don’t see a reason to always use the frontier model.

Scripting:

I am not a software engineer. I do not write code all day. I don’t even like writing code all that much. Anyone who enjoys writing code for its own sake, or does it for a living, probably has a very different view of how useful each model is, if at all.

What I do like is answering questions and solving problems, and sometimes that means I have to make my own tools and write scripts to get things done. LLMs have been useful to me in this way because whatever I make doesn’t have to be ready to go into a production-level codebase that anyone else depends on; it just needs to work.

Things they’ve helped with:

  • Talking through a project idea and creating design docs
  • Generating scripts
  • Reviewing my own code
  • Explaining code that I hadn’t written

One thing I do make sure to do is go through any generated or changed code so that I can explain why each part is the way it is.

App Generation:

This is similar to scripting, except I don’t care about knowing anything about the code. My example for this is that I needed a way to annotate code with comments that looked similar to Google Docs comments, but not on their platform. annotate.dev came close, but it didn’t have the privacy settings I needed to share things without paying to add more members to my group. I used Gemini to make a tool that let me drop in code, make any comments I wanted on the side with color coding, and then export the results as an HTML file I could give people.
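The core of that workflow is simpler than it sounds. As a rough sketch (this is my own minimal reconstruction, not the actual generated app: the function name, comment format, and styling are all made up for illustration), it boils down to pairing each line of code with an optional colored note and writing out one standalone HTML file:

```python
import html

def export_annotated_html(code, comments):
    """Render source code with side comments as a standalone HTML page.

    code:     the source text to annotate
    comments: {line_number: (comment_text, css_background_color)}
    """
    rows = []
    for n, line in enumerate(code.splitlines(), start=1):
        note, color = comments.get(n, ("", ""))
        # Only annotated lines get a colored comment cell on the right.
        note_cell = (
            f'<td style="background:{color};padding:2px 8px">{html.escape(note)}</td>'
            if note else "<td></td>"
        )
        rows.append(
            f"<tr><td><pre>{html.escape(line)}</pre></td>{note_cell}</tr>"
        )
    return (
        "<!DOCTYPE html><html><body>"
        '<table style="font-family:monospace">' + "".join(rows) + "</table>"
        "</body></html>"
    )

page = export_annotated_html(
    "def add(a, b):\n    return a + b",
    {2: ("Consider type hints here", "#fff3b0")},
)
# Write `page` to a .html file and it opens in any browser, no server needed.
```

Because everything is inlined into one file, whoever receives it needs nothing but a browser, which is exactly what makes this kind of throwaway tool shareable without accounts or paid seats.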

I think a lot of the time when people talk about enshittification, they really mean they just didn’t like a product design decision, and something like this can solve that problem for them. If the app you’re using makes a change you don’t like, you can just build something that works exactly how you want it to.

Writing and Reviewing:

I don’t want an LLM writing anything that really matters for me. The words may make sense, but the tone never seems natural. At most I will ask it to extend something that needs to be longer for the sake of being longer and even then I usually go back and rewrite most of it.

I do ask for it to review the first few drafts of some things I write. I don’t think they have a perfect understanding of how people might read something, so I will always ask a real person to look at it at least once afterwards, but it does save someone else the time of reading my completely unpolished first draft.

You have to be explicit about what it should look for though. Emphasize that the wording and tone should stay mostly the same to keep the original voice. Ask for it to focus on things like finding factual errors, looking for areas that might not make sense to readers, or other connecting ideas that you might have overlooked.

OCR, Image Search, and Live Camera:

If I take a screenshot of text on my phone I can hold down a button and have Gemini help grab the text from it to use as alt text. Not the biggest deal in the world, but I think making accessibility easy to implement helps. I’m curious if I could get descriptions of images that are good enough to use but I haven’t tried setting something like that up yet.

If I see something on my phone that I want to look for, I hold down the button and use Gemini as a Google image search.

If I point my phone camera at something, I can ask Gemini questions about it. This worked a lot better than I was expecting. I think the thing I was most impressed by was how accurately it estimated the amount of liquid in a container.