AI singer-songwriter ‘Anna Indiana’ debuted her first single ‘Betrayed by this Town’ on X, formerly Twitter—and listeners were not too impressed.

  • azimir@lemmy.ml
    ↑ 80 · 1 year ago

    I’ve been doing a lot of using, testing, and evaluating LLMs and GPT-style models for generating code and text/prose. Some of it is just general use to see how it behaves, some has been explicit evaluation of creative writing, and a bunch of it is code generation to test out how we need to modify our CS curriculum in light of these new tools.

    It’s an impressive piece of technology, but it’s not very creative. It’s meh. The results are meh. Which is to be expected since it’s a statistical model that’s using a large body of prior work to produce a reasonable approximation of what it’s seen before. It trends towards the mean, not the best.

    • AgnosticMammal@lemmy.zip
      ↑ 20 · 1 year ago

      This’d explain why inexperienced users of AI would inevitably get mediocre results. Still takes creativity to get stolen mediocrity.

      • TheMechanic@lemmy.ca
        ↑ 26 · 1 year ago

        You have to know how to operate the oven to reheat store bought pie. Generative LLMs are machines like ovens, and turning the knobs is not creativity. Not operating the oven correctly gets you Sharon Weiss results.

      • anachronist@midwest.social
        English · ↑ 5 · 1 year ago

        I guess a protip is you have to tell it explicitly in the prompt who it’s supposed to steal from.

        For instance, Midjourney or Stable Diffusion (SD) will produce much better results if you put specific ArtStation channel names along with ‘artstation’ in the prompt.

    • queermunist she/her@lemmy.ml
      ↑ 13 · 1 year ago

      I’m excited for how these tools will be used by human creators to accomplish things they could never do alone, and in that respect it is a revolutionary technology. I hate that their marketing calls it “AI” though; the only intelligence involved is the human user who creates prompts and curates results.

    • Unaware7013@kbin.social
      ↑ 9 · 1 year ago

      > and a bunch of it is code generation to test out how we need to modify our CS curriculum in light of these new tools.

      I’m curious if you’ve gotten anything decent out of them. I’ve tried to use it for tech/code questions, and it’s been nothing but disappointment after disappointment. I’ve tried to use it to get help with new concepts, but it hallucinates like crazy and always gives me bad results. Some of the time it’s so bad that it gives me answers I’ve already told it were wrong.

      • aiccount@monyet.cc
        ↑ 6 · 1 year ago

        Yeah, I’ve just set up a hotkey that says something like “back up your answer with multiple reputable sources” and I just always paste it at the end of everything I ask. If it can’t find webpages to show me to back up its claims then I can’t trust it. Of course this isn’t the case with coding; for that I can actually run the code to verify it.
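
For anyone who would rather script the hotkey idea than paste by hand, here is a minimal sketch. The suffix wording comes from the comment above; the function name `with_sources_request` and the constant `SOURCES_SUFFIX` are made up for illustration and are not part of any tool mentioned in this thread.

```python
# Hypothetical helper: appends a "cite your sources" request to a prompt.
# The names here are illustrative assumptions, not an existing API.
SOURCES_SUFFIX = "Back up your answer with multiple reputable sources."


def with_sources_request(prompt: str, suffix: str = SOURCES_SUFFIX) -> str:
    """Return the prompt with the sources request appended exactly once."""
    prompt = prompt.rstrip()
    if prompt.endswith(suffix):
        # Already appended, e.g. the hotkey was pressed twice.
        return prompt
    return f"{prompt}\n\n{suffix}"


if __name__ == "__main__":
    print(with_sources_request("What does PowerShell DSC do?"))
```

This only builds the text to send. Whether the model actually returns real, checkable sources still has to be verified by hand.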

      • kromem@lemmy.world
        English · ↑ 3 · 1 year ago

        What version are you using?

        GPT-4 is quite impressive, and the dedicated code LLMs like Codex and Copilot are as well. The latter must have had a significant update in the past few months, as it’s become wildly better almost overnight. If trying it out, you should really do so in an existing codebase it can use as a context to match style and conventions from. Using a blank context is when you get the least impressive outputs from tools like those.

        • Unaware7013@kbin.social
          ↑ 1 · 1 year ago

          I’ve used GPT-3/3.5, Bing, Bard and Copilot, and I’m not super stoked. My most recent attempt at using an LLM was Copilot, which gave me PS DSC items that don’t actually exist.

          I might see about figuring out if it can hook into my VS Code instance so it’s a bit smarter at some point.

          • kromem@lemmy.world
            English · ↑ 1 · 1 year ago

            > I might see about figuring out if it can hook into my vs code instance so it’s a bit smarter at some point.

            There’s an official plug-in to do this that takes like 15 minutes to set up.

    • kromem@lemmy.world
      English · ↑ 3 · 1 year ago

      > It trends towards the mean, not the best.

      That’s where some of the significant advances over the past 12 months of research have been, specifically around using the fine-tuning phase to bias towards excellence. The biggest advance there has been that capabilities in larger models seem to be transmissible to smaller models by feeding in output from the larger, more complex models.

      Also, the process supervision work from May to enhance chain-of-thought (CoT) reasoning is pretty nuts.

      So while you are correct that the pretrained models come out with a regression towards the mean, there are very promising recent advances in taking that foundation and moving it towards excellence.