Monday, February 17, 2025

How I Learned to Stop Worrying and Love the LLM, part 1 -- coding

I used to ask, only half-jokingly, how any of us learned to code before Google.  I went to grad school in the mid-90s, so I'm old enough to remember that literal bookshelf of not-very-helpful SAS manuals.  Getting a straightforward answer to a basic coding question often seemed insurmountable.  You can imagine, then, the revolutionary impact of the internet and various online resources.

The advance represented by LLMs has been comparable. While I would never consider using one in a situation where I needed the kind of background and understanding that comes from a textbook or course, for straightforward "how do I code this" questions I can no more imagine Googling the topic or turning to an online forum than I can imagine digging through that old stack of phone books (I just can't stop dating myself in this post).

Out of perhaps excessive caution, I never give ChatGPT any real data or metadata. I rename the tables something unimaginative like "A" or "B."  For the fields, I try to use something that falls in the same general category.  For example, Buyer_ID might become SSN.  There are no doubt countless examples of social security numbers in the LLMs' training data, and virtually all of that data treats an SSN as a unique identifier.
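As a minimal sketch of what that habit looks like in practice (in Python; Purchase_Amt and Salary are made-up stand-ins I'm adding alongside the Buyer_ID example above), only the disguised names, and never any real data or metadata, end up in the question:

```python
# A minimal sketch of the anonymization habit described above.
# Purchase_Amt and Salary are made-up stand-ins (not from the post);
# the Buyer_ID -> SSN swap is the example given in the text.

# Real field names (never sent to the LLM) mapped to stand-ins
# from the same general category, e.g. one kind of unique
# identifier swapped for another.
rename_map = {
    "Buyer_ID": "SSN",
    "Purchase_Amt": "Salary",
}

table_alias = "A"  # the table itself gets an unimaginative name

# Only the disguised names -- no real data or metadata -- go into the question.
disguised_fields = ", ".join(rename_map.values())
prompt = (
    f"I have a table {table_alias} with fields {disguised_fields}. "
    "How do I flag rows with duplicate SSN values?"
)
print(prompt)
```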

It does have limitations and will get some things wrong, particularly if you let the problem get too complex, but if you keep things bite-sized and are absolutely clear about your logic, the LLM performs remarkably well. I don't know if this makes up for the huge environmental cost of building these models, and it certainly doesn't balance out the damage generative AI has done, but used properly, these are genuinely useful and powerful tools.

2 comments:

  1. I always think focusing on the environmental impact of these large models misses the point, as it is really tiny compared to the other impacts they are having. For reference, from the most up-to-date data I can find after 15 minutes of googling around, the largest CO2 emission figure for training any one model is 40,000 tons (Gemini Ultra in Dec. 2023). That works out to about the same as the yearly carbon emissions of about 2,500 Americans, or 0.0006% of US carbon emissions per year. I have yet to find a compelling analysis for why I should care about this.
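
    For what it's worth, here is a rough sanity check of those numbers in Python, using round assumptions of about 16 tons of CO2 per American per year and on the order of 6 billion tons of total annual US emissions (ballpark figures, not sourced):

    ```python
    # Rough sanity check of the back-of-envelope numbers above.
    # Per-capita and national totals are round assumptions for
    # illustration, not sourced figures.

    training_emissions_t = 40_000        # reported tons of CO2 for one large model
    per_capita_t = 16                    # assumed tons of CO2 per American per year
    us_total_t = 6_000_000_000           # assumed total annual US emissions, tons

    equivalent_people = training_emissions_t / per_capita_t
    share_of_us_pct = training_emissions_t / us_total_t * 100

    print(f"~{equivalent_people:,.0f} Americans' worth of annual emissions")  # ~2,500
    print(f"~{share_of_us_pct:.4f}% of annual US emissions")                  # ~0.0007%
    ```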

  2. The plan is to change the standard data center rack from a 25 kW device to a 125 kW device (roughly speaking, the next-gen GPU racks will draw 5 times the power). The rack design (with the next-gen Nvidia chips) is in place. The physical plant isn't. The cooling noise is already so loud that most hearing protection gear is inadequate to protect operators during the regular (required, multiple times per day) inspections. (I have a close friend who manages data centers and a nephew who designs cooling systems for them. They.Have.Horror.Stories.)

    These guys put out some fun reads: https://semianalysis.com/

    But, yes. If all datacenters are updated to 125 kW racks, it'll maybe begin to be a problem. Meanwhile cars and planes are the things that are destroying the remaining human-survivable parts of the earth...

    Dunno how great the coding is, though. People seem to like it, but for the stuff I'm doing (handling Japanese text) I can't imagine it being helpful. That is, I'm not doing things that have been done before. Cranking out another cell phone app could use some help with the boilerplate, I'd guess...
