Veo 3 and Imagen 4, and a new tool for filmmaking called Flow

5/21/2025 at 10:32:28 AM

This demo video of Veo 3 on reddit, featuring a variety of characters talking in different scenarios and accents, is one of the most incredible AI demos I have ever seen: https://www.reddit.com/r/ChatGPT/comments/1krmsns/wtf_ai_vid...

Created by Ari Kuschnir

by oliwary

5/21/2025 at 2:57:05 PM

Good lord.

I think the change here will be something we've seen with the other modalities. Text was interestingly syntactically correct but nonsense sentences. Then paragraphs but the end of the article would go off the rails. Then the article. Now it's that the creativity of the children's story in question.

Pictures were awful fever dreams filled with eyes but you could kind of see a dog. Then you could see what it was, then decent

Videos were fun that they kind of worked, then surprising it took a few seconds for the panda to turn into spaghetti, then it kept the general style for a decent time.

I see this moving towards the creativity being the major thing, or it having a few general styles (softly lit background for example).

This has mostly all shifted in a very short space of time and as someone who put RBMs on GPUs possibly for the first time (I'm gonna claim it) this is absolutely wild.

Had I seen some of this, say, 6 months ago I'd not have guessed at all bits weren't real.

by IanCal

5/22/2025 at 3:28:10 PM

A very important demo video I also found on reddit is this one [0] that's a fairly generic series of action scenes of a raid leading to a gun fight. The individual scenes are look mostly fine, notable exceptions being the muzzle flashes and nonsense guns in a few shots, but the connecting flow is nonsense if you look at it even a little. It has some of the consistency issues that are a bit of a halmark of AI videos, the interior size and layout of rooms and vehicles morphs and shifts from 'shot' to 'shot', they get out of the vehicle twice, etc. The wheels really come off in the more actiony scene though with the pace feeling very plodding for what would be an intense scene in even a moderately competitent human editor. Also some of the 'cops' wind up shooting each other in one scene which was a funny mistake.

[0] https://www.reddit.com/r/ChatGPT/comments/1kru6jb/this_video...

by rtkwe

5/21/2025 at 3:01:19 PM

Last night my girlfriend asked me why I kept watching the same bland sounding videos again and again. She came over and watched for a bit, gave a sort of confused laugh of solidarity to something, like "Uhh, why is he so into this? But, ok, I guess..." and then walked away.

It wasn't until I was able to get my jaw off the ground that I told her it was AI. No, not AI like special effects, completely AI.

by Workaccount2

5/21/2025 at 3:43:46 PM

If you have ever tried to use VEO in Google's AI studio, it lets you upload a starting frame image and ending frame image which is cool.

But they do not allow any people in the image even cartoon depictions of humans. This knee caps a lot of potential usage.

by tmaly

5/21/2025 at 8:21:23 PM

Looks like AI crossed a line. At the very least, one person can do long form documentaries from their basement using VEO 3. There is no need for camera shoots. Yikes.

This reminds me of Pixar's video of an animated lamp 40 years ago. I remember that within 5 years Toy Story came out and changed everything on how animated films were made. Looks to me like we are on our way to doing the same thing with realistic movies.

by WheelsAtLarge

5/22/2025 at 2:07:16 PM

What are they documenting if it’s entirely AI generated images?

by bathtub365

5/23/2025 at 2:42:38 AM

Good point, how about something history related? Ken burns does something similar with photos. I've also seen animation used in documentaries. So use AI instead. How about a mock documentary? Spinal Tap comes to mind.

by WheelsAtLarge

5/21/2025 at 2:15:32 PM

Calling it now.

Someone will use AI to make the "AI Killed the Video Star" video. Probably the same guy that made this[1] and other masterpieces.

[1]https://www.youtube.com/watch?v=EICWYazyqu4

by marcyb5st

5/21/2025 at 3:31:17 PM

I thought you were going to link "Video Killed The YTMND Star" - which gives me quite the dose of nostalgia: https://www.youtube.com/watch?v=D6D9arrHiLE

by epiccoleman

5/21/2025 at 7:54:57 PM

It’s true. YTMND nailed the TikTok / Vine format like 12 years ahead of its time. If only they’d “pivoted to mobile” and added more ease of use creation tools they may have stayed relevant.

by kridsdale1

5/20/2025 at 6:22:12 PM

After doing some testing, Imagen 4 doesn't score any higher than Imagen 3 on my comparison chart, approximately ~60% prompt adherence accuracy.

https://genai-showdown.specr.net

by vunderba

5/20/2025 at 10:56:19 PM

I'm curious why you decide to declare victory after one successful attempt, but try many times for unsuccessful models. Are you trying to measure whether a model _can_ get it right, or whether it frequently _does_ get it right? I feel like success rate is a better metric here, or at least a fixed number of trials with some success rate threshold to determine model success.

by bigmadshoe

5/21/2025 at 3:21:15 AM

It's hard to nail down a good objective metric on something that is always going to be marginally qualitative in nature but it's a good call out - I should probably add a FAQ to the site.

To clarify this test is purely a PASS/FAIL - unsuccessful means that the model NEVER managed to generate an image adhering to the prompt. So as an example, Midjourney 7 did not manage to generate the correct vertical stack of translucent cubes ordered by color in 64 gen attempts.

It's a little beyond the scope of my site but I do like the idea of maintaining a more granular metric for the models that were successful to see how often they were successful.

by vunderba

5/21/2025 at 3:32:21 AM

Makes sense. It just set off some statistical alarm bells in my head to see a model marked as passing with 1 trial, and some models marked as failing with 5. What if the probability of success is 5% for both models? How confident are we that our grading of the models is correct? It's an interesting problem.

Cool site btw! Thanks for sharing.

by bigmadshoe

5/21/2025 at 6:21:11 AM

The current metric is actually quite strong -- it mirrors the real-world use case of people trying a few times and being satisfied if any of them's what they're looking for. It rewards diversity of responses.

Actually, search engines do this this too: Google something with many possible meanings -- like "egg" -- on Google, and you'll get a set of intentionally diversified results. I get Wikipedia; then a restaurant; then YouTube cooking videos; Big Green Egg's homepage; news stories about egg shortages. Each individual link is very unlike the others to maximize the chance that one of them's the one you want.

by npinsker

5/21/2025 at 6:06:18 AM

Its made a little bit better by the fact that there's something like a dozen different prompts. Across all of the prompts each model had a fair number of opportunities to show off.

by Taek

5/21/2025 at 2:51:50 AM

It is indicative of marginal improvements instead of new breakthroughs. iPhone 1 was a paradigm shift. iPhone 10 was essentially iPhone 9 with tweaks. As an AI optimist I would be disappointed to find we are already seeing diminishing returns on R&D.

by ipnon

5/21/2025 at 4:35:59 AM

Not only did the iPhone 9 never exist, the iPhone X was a huge paradigm shift in design and capabilities. That was the phone that introduced edge-to-edge OLED screens to the iPhone line, as well as the IR camera that enabled FaceID and the first generation of Portrait mode. I know it well since it also introduced the ability for developers to build facial motion capture apps that would’ve previously required expensive pro hardware and allowed people like me to build live facial motion capture effects for theatre.

Sorry to dunk so hard, but your example of technology stagnating is actually an example of breakthrough technological innovation deep into a product’s lifecycle: the very thing you were trying to say doesn’t happen.

by Uehreka

5/21/2025 at 12:17:44 PM

Arguably OLED screens and IR cameras are no paradigm shift. At least nothing comparable to "no smartphone" to iPhone 1.

by ofrzeta

5/21/2025 at 2:26:25 PM

There were smartphones before the iPhone. One could also describe the difference as "just a touchscreen".

by jere

5/21/2025 at 2:48:40 PM

The iPhone 1 featured "touch screen, GPS, camera, iPod, and internet access. Its software capabilities were a turning point for the smartphone industry" (random source: https://www.textline.com/blog/smartphone-history).

If you want to doubt that it was in fact a not a turning point you'd need to provide very strong arguments.

by ofrzeta

5/21/2025 at 6:37:19 PM

All of the things you mentioned were available in phones before the first iPhone (assuming by ipod you mean mp3 player). In fact from a software point of view it was lacking a bunch of functionality and software ecosystem some competitors had.

In my view the reason the iPhone felt so new was almost entirely the incredibly responsive capacitive touch screen with a finger ui, everything I'd used before it did resistive and preferred pen for detail. Pen actually is better for detail so in some ways it was that more than anything else that turned the device from a creation device to a consumption device which was whole new way of thinking about smart personal devices.

Of course it was also sold in a decent package too where Apple did deals that ensured it was available with good mobile internet plans which were also unusual at the time.

by kybernetikos

5/21/2025 at 3:24:55 PM

Also, the touchscreen was the type that unlike all previous touchscreens (except the ones made by a startup that Apple had bought) could detect touches at more than one screen location simultaneously.

by hollerith

5/21/2025 at 5:59:50 PM

as someone who owned a number of smartphones and PDAs prior to the first iPhone coming out, the real advance was a usable mobile browser. i'd had all the same capabilities with devices for quite some time before the iphone came out, but their browsers were painful to use. the touch interface was also a big advance over previous touch interfaces. in other areas the first iphone was lacking compared to other smartphones.. copy and paste and 3rd party apps were missing for example.

by spogbiper

5/21/2025 at 5:06:58 AM

technological advancements yes, but did it drastically changed how users (majority) use the iphone. I'd say marginally. Fancy selfie filters, ok i'll give it that. But edge to edge screens, meh, give me back my home button :D

by vincnetas

5/21/2025 at 10:48:57 AM

Just moments ago, I managed to turn a photo of a person into a short clip of them dancing, in half-decent quality, fully locally, on a mid-range gaming GPU (RTX 4070 Ti, 12GB VRAM). I almost run out of RAM (32GB), but it worked, worked well, and took only couple of minutes.

Half a year ago, that was sort of possible for some genius really bent on making it happen. A year ago, that was unthinkable. Today, it's a matter of drag&dropping a workflow to a fresh ComfyUI install and downloading a couple dozen GB of img2vid models.

The returns on R&D are not diminishing, the progress is just not happening everywhere evenly and at the same time.

by TeMPOraL

5/21/2025 at 4:10:51 AM

At the risk of unbearable pedantry, there's never been an iPhone 9. (There was never a 2 either; there was kind of a 3, although it was really called the 3G.)

by sethaurus

5/20/2025 at 7:20:29 PM

The winning image entry for "The Yarrctic Circle" by OpenAI 4o doesn't actually wields a cutlass. It's very aesthetically pleasing, even though it's so wrong in all fundamental aspects (perspective is nonsensical and anatomy is messed up, with one leg 150% longer than the other, ...).

It's a very interesting resource to map some of the limits of existing models.

by woolion

5/20/2025 at 10:37:49 PM

In my own testing between the two this is what I’ve noticed. Imagen will follow the instructions, and 4o will often not, but produces aesthetically more pleasing images.

I don’t know which is more important, but I would say that people mostly won’t pay for fun but disposable images, and I think people will pay for art but there will be an increased emphasis on the human artist. However users might pay for reliable tools that can generate images for a purpose, things like educational illustrations, and those need to be able to follow the spec very well.

by danpalmer

5/21/2025 at 12:19:05 AM

People pay for digital sticker packs so their memoji in iMessage are customized. How much money they make on sticker packs is unknown to me, but image generation platform Midjourney seems to be doing alright.

by fragmede

5/21/2025 at 3:29:04 AM

Midjourney got in REALLY early in the GenAI game despite only allowing image generation through Discord for at least a year. I heard that it was one of the largest Discord channels ever having something absurd like 20+ million members.

I'd love to see some financials but I'd tend to agree they're probably doing pretty well.

by vunderba

5/21/2025 at 1:00:55 PM

o4-mini-high I’ve noticed is far better the 4o on prompt adherence in image generation in personal use.

by ilikehurdles

5/20/2025 at 9:36:43 PM

Google Flow is remarkable as video editing UX, but Imagen 4 doesn't really stand out amongst its image gen peers.

I want to interrupt all of this hype over Imagen 4 to talk about the totally slept on Tencent Hunyuan Image 2.0 that stealthily launched last Friday. It's absolutely remarkable and features:

- millisecond generation times

- real time image-to-image drawing capabilities

- visual instructivity (eg. you can circle regions, draw arrows, and write prompts addressing them.)

- incredible prompt adherence and quality

Nothing else on the market has these properties in quite this combination, so it's rather unique.

Release Tweet: https://x.com/TencentHunyuan/status/1923263203825549457

Tencent Hunyuan had a bunch of model releases all wrapped up in a product that they call "Hunyuan Game", but the Hunyuan Image 2.0 real time drawing canvas is the real star of it all. It's basically a faster, higher quality Krea: https://x.com/TencentHunyuan/status/1924713242150273424

More real time canvas samples: https://youtu.be/tVgT42iI31c?si=WEuvie-fIDaGk2J6&t=141 (I haven't found any other videos on the internet apart from these two.)

You can see how this is an incredible illustration tool. If they were to open source this, this would immediately become the top image generation model over Flux, Imagen 4, etc. At this point, really only gpt-image-1 stands apart as having godlike instructivity, but it's on the other end of the [real time <--> instructive] spectrum.

A total creative image tool kit might just be gpt-image-1 and Hunyuan Image 2.0. The other models are degenerate cases.

More image samples: https://x.com/Gdgtify/status/1923374102653317545

If anyone from Tencent or the Hunyuan team is reading this: PLEASE, PLEASE, PLEASE OPEN SOURCE THIS. (PLEASE!!)

by echelon

5/20/2025 at 10:35:57 PM

> but Imagen 4 doesn't really stand out amongst its image gen peers.

In this AI rat race, whenever one model gets ahead, they all tend to reach parity within 3-6 months. If you can wait 6 months to create your video I'm sure Imagen 5 will be more than good enough.

It's honestly kind of ridiculous the pace things are moving at these days. 10 years ago waiting a year for something was very normal, nowadays people are judging the model-of-the-week against last week's model-of-the-week but last week's org will probably not sleep and they'll release another one next week.

by dheera

5/20/2025 at 10:12:53 PM

This is amazing, can’t see how I’ve missed it. Thank you!

by Narciss

5/21/2025 at 12:58:28 PM

I've given this some more thought. Even if Imagen 4 isn't that great on its own, all of Google's models and UX products in conjunction (Veo 3, Flow, etc.) are orders of magnitude above the rest of the playing field.

If Tencent wants to keep Google from winning the game, they should open source their models. From my perspective right now, it looks like Google is going to win this entire game, and open source AI might be the only way to stop that from being a runaway victory.

by echelon

5/21/2025 at 12:11:08 AM

Good catch - that's on me I accidentally uploaded the wrong image for gpt-image-1. Fixed!

by vunderba

5/20/2025 at 9:16:29 PM

I can't find the image you're talking about. Link pls?

by NoahZuniga

5/20/2025 at 8:26:09 PM

Hands in Winning entry in "Not the Bees" are very unlike any driver. I wouldn't count it as a pass.

by tintor

5/20/2025 at 10:32:05 PM

I hate to say it but I feel like as a result of staring at so many equivalents of Tyrone Rugen since the dark ages of Stable Diffusion 1.5 - I literally DID NOT EVEN notice that until you called it out. The training data in my wetware has been corrupted.

by vunderba

5/20/2025 at 8:22:43 PM

More difficult examples:

- wine glass that is full to the edge with wine (ie. not half full)

- wrist watch not showing V (hands at 10 and 2 o'clock)

- 9 step IKEA shelf assembly instruction diagram

- any kind of gymnastics / sport acro

by tintor

5/21/2025 at 12:10:56 AM

What's the reason to test the "not showing ..."? I've never seen anyone make that kind of request in real life. They ask for what they actually want instead. You'd ask for a clock showing 3:25 rather than "not 10:10".

I mean, it's a fun edge case, but I'm practice - does it matter?

by viraptor

5/21/2025 at 12:32:03 AM

> I mean, it's a fun edge case, but I'm practice - does it matter?

*in practice, not I'm practice. (I swear I have a point, I'm not being needlessly pedantic.) In English, in images, mistakes stick out. Thus negative prompts are used a lot for iterative image generation. Even when you're working with a human graphics designer, you may not know what exactly you want, but you know that you don't want (some aspect of) the image in front of you.

Ie: "Not that", for varying values of "that".

by fragmede

5/21/2025 at 1:29:49 AM

> Thus negative prompts are used a lot for iterative image generation.

Are they still? The negative keywords were popular in SD era. The negative prompt was popular with later models in advanced tools. But modern iterations look different - the models capable of editing are perfectly fine with processing the previous image with a prompt "remove the elephant" or "make the clock show a different time". Are the negative parts in the initial prompt still actually used in iteration?

by viraptor

5/20/2025 at 8:16:48 PM

How can you tell you're using Imagen 4 and not Imagen 3? Gemini seems unable to tell me which model it's using. Are you using Vertex AI?

by strongpigeon

5/20/2025 at 10:24:12 PM

I used Whisk. The model listing shows 3/4 because testing against Imagen 4 did not result in a measurable increase in accuracy from Imagen 3.

https://labs.google/fx/tools/whisk

by vunderba

5/20/2025 at 9:49:10 PM

Well they've labelled it 3/4 so I'm guessing they can't but you can use 4 it in whisk

by sidibe

5/20/2025 at 8:40:06 PM

Tell me you’re using Imagen 3 without telling me you’re using Imagen 4… or something

by EGreg

5/21/2025 at 9:33:36 AM

Side note. It's my understanding that being a pith helmet is pretty orthogonal to having a spike. Plenty of helmets with spikes aren't pith helmets and plenty of pith helmets don't have spikes.

Not sure if this affects your results or not but I resist chiming in!

by andybak

5/21/2025 at 9:45:46 AM

Also "Hippity Hop" is a Space Hopper! Wikipedia agrees with me: https://en.wikipedia.org/wiki/Space_hopper :)

I wonder how much the commonality or frequency of names for things affects image generation? My hunch is that it it roughly correlates and you'd get better results for terms with more hits in the training data. I'd probably use Google image search as a rough proxy for this.

by andybak

5/20/2025 at 6:32:04 PM

How do companies like https://icon.com do their image Gen if the existing SOTA for prompt adherence is so poor?

by Onavo

5/20/2025 at 7:10:44 PM

People who generate images for ads probably don't often need strict prompt adherence, just a random backdrop to slap a picture of their product on top of. The kind of thing they'd have used a stock image library for before.

Also "create static + video ads that are 0-99% complete" suggests the performance is hit or miss.

by yorwba

5/21/2025 at 2:21:01 AM

Exactly this. It just helps the foundation which doesn’t need specific details in most cases.

by AsmodiusVI

5/20/2025 at 7:00:48 PM

fine tuning and prompt techniques can go a long way. That + cherrypicking results

by peab

5/20/2025 at 9:30:25 PM

multishot generation with discriminators

by htrp

5/20/2025 at 8:49:08 PM

> "A dolphin is using its fluke to discipline a mermaid by paddling it across the backside."

Hmm.

by mcphage

5/21/2025 at 2:30:50 PM

They failed but man those snakes are cool. Awesome website!

by anton-c

5/20/2025 at 7:19:24 PM

How do you determine how many attempts are made before the results are failing?

by snug

5/20/2025 at 8:48:38 PM

It's listed in Purple to the right of the model name.

by mcphage

5/20/2025 at 9:32:04 PM

I think they're asking how the number to stop at was determined, not what the number stopped at was.

My guess as to determining whether it's 64 attempts to a pass for one and 5 attempts to a fail for another is simply "whether or not the author felt there was a chance random variance would result in a pass with a few more tries based on the initial 5ish". I.e. a bit subjective, as is the overall grading in the end anyways.

by zamadatix

5/20/2025 at 10:29:40 PM

That's exactly what it was. It's hard to define a discrete rubric for grading at an inherently qualitative level. Usually more attempts means that it seemed like the model had the "potential" to get across the finish line so I gave it more opportunities.

If there's only a few attempts and ends in a failure, there's a pretty good chance that I could sort of tell that the model had ZERO chance.

by vunderba

5/20/2025 at 6:42:32 PM

Awesome showcase! Fun descriptions. Are there similar sites?

by xixixao

5/21/2025 at 12:30:16 AM

Thanks! There are definitely other GenAI image comparison sites out there - but I found that the majority of them were more concerned with visual fidelity which IMHO is a less challenging problem than prompt adherence.

This is probably one of the better known benchmarks but when I see Midjourney 7 and Imagen3 within spitting distance of each other it makes me question what kind of metrics they are using.

https://artificialanalysis.ai/text-to-image

by vunderba

5/20/2025 at 9:32:23 PM

I love the writing style in this.

by zamadatix

5/21/2025 at 1:58:00 AM

The website is broken

by mvdtnz

5/21/2025 at 3:17:30 AM

That's unusual - I don't see anything in the logs and perf tests / website speed tests show everything is good. Maybe Cloudflare had a hiccup.

by vunderba

5/20/2025 at 7:00:09 PM

great website!

by peab

5/20/2025 at 6:52:48 PM

It finally feels like the professional tools have greatly outpaced the open source versions. While wan and hunyuan are solid free options, the latest from Google and Runway have started to feel like a league above. Interestingly it feels like the biggest differentiator is editing tools - ability to prompt motion, direction, cuts, or weaving in audio, rather than just pure ability to one shot.

These larger companies are clearly going after the agency/hollywood use cases. It'll be fascinating to see when they become the default rather than a niche option - that time seems to be drawing closer faster than anticipated. The results here are great, but they're still one or two generations off.

by jjcm

5/20/2025 at 9:23:03 PM

> While wan and hunyuan are solid free options, the latest from Google and Runway

The Tencent Hunyuan team is cooking.

Hunyuan Image 2.0 [1] was announced on Friday and it's pretty amazing. It's extremely high quality text-to-image and image-to-image with millisecond latency [2]. It's so fast that they've built a real time 2D drawing canvas application with it that pretty much duplicates Krea's entire product offering.

Unfortunately it looks like the team is keeping it closed source unlike their previous releases.

Hunyuan 3D 2.0 was good, but they haven't released the stunning and remarkable Hunyuan 3D 2.5 [3].

Hunyuan Video hasn't seen any improvements over Wan, but Wan also recently had VACE [4], which is a multimodal control layer and editing layer. The Comfy folks are having a field day with VACE and Wan.

[1] https://wtai.cc/item/hunyuan-image-2-0

[2] https://www.youtube.com/watch?v=1jIfZKMOKME&t=1351s

[3] https://www.reddit.com/r/StableDiffusion/comments/1k8kj66/hu...

[4] https://github.com/ali-vilab/VACE

by echelon

5/20/2025 at 7:05:01 PM

I think open source still has an important advantage in the pro environment despite being less convenient, and it's the possibility of adding things in between the generation process like control net, and custom loras with new concepts or characters.

Plus in local generation you're not limited by the platform moderation that can be too strict and arbitrary and fail with the false positives.

Yes comfy UI can be intimidating at first vs an easy to use chatgpt-like ui, but the lack of control make me feel these tools will still not being used in professional productions in the short term, but more in small YouTube channels and smaller productions.

by javchz

5/20/2025 at 9:08:00 PM

I don't think this is just about convenience - you're not going to get these results with a 14B video model. I'd much prefer to have something I could hack on in ComfyUI but the open weights models don't compete with this anymore than a 32B LLM competes with Gemini 2.5 Pro for coding. And at least in coding you can easily edit the output from the LLM regardless...

by MrScruff

5/20/2025 at 9:25:45 PM

> you're not going to get these results with a 14B video model

Foundation models are starting to outstrip any consumer hardware we have.

If Nvidia wants to stay ahead of Google's data center TPUs for running all of these advanced workloads, they should make edge GPU compute a priority.

There's a future where everything is a thin client to Google's data centers. Nvidia should do everything in its power to prevent that from happening.

by echelon

5/20/2025 at 11:12:33 PM

Your post strangely sounds like Nvidia primarily makes graphic cards for consumers.

Last time I checked, they couldn't produce enough H100s/GB100s to satisfy demand from everyone and their mother running a data center. And their most recent consumer hardware offerings have been repeatedly called a "paper launch" - probably because consumer hardware isn't a priority, given the price (and profit) delta.

by sigmaisaletter

5/21/2025 at 3:02:33 PM

I read their comment as meaning that Nvidia should prioritise a specific kind of consumer/prosumer hardware.

Nobody is running H100s at home, nor are most video companies running ones. So the choice for them is to "rent" them from Google, or... invest a lot in almost impossible to obtain Nvidia hardware? One has lower initial cost, and is available now.

by sofixa

5/21/2025 at 8:36:31 PM

Thanks for the (possible) clarification.

But as long as Google isn't their _only_ customer, why would Nvidia care?

by sigmaisaletter

5/21/2025 at 9:38:38 AM

>There's a future where everything is a thin client to Google's data centers. Nvidia should do everything in its power to prevent that from happening.

there has always been, the mainframe concept is not new. but it goes in and out of fashion.

>>>> mainframe

<<<< personalpc

>>>> web pages/social media

<<<< personal phones/edge

>>>> cloud ai

<<<< ???? personal robotics, chips and ai ???

>>>> ???? rented swarms ???

by larodi

5/20/2025 at 8:11:31 PM

Control net etc can be served via API; the intrinsic advantage of open-source is the ability to train and run inference privately.

by popalchemist

5/21/2025 at 5:08:34 AM

Someone out there might care about nudity, but unfortunately, nobody that matters.

by doctorpangloss

5/21/2025 at 3:00:31 PM

We already have seen that Opensource can compete which is a lot more than people expected. After all opensource and running huge models?

But what it means, that with time, Opensource will be as good as what commercial offerings now have. Hardware will get cheaper, research is open or delayed open.

by Flamentono2

5/20/2025 at 8:25:28 PM

> the agency/hollywood use cases.

It's for advertising.

by irq-1

5/21/2025 at 5:02:36 AM

IMO, this is a misconception. For example, in the case of social media display ads (i.e., not the typical Google text ad), most campaigns are "saturation," they only work if the creatives are seen 100+ times by the intended audience and look more or less exactly the same, which is kind of the exact opposite theory that benefits from being able to create unlimited personalized creatives.

by doctorpangloss

5/22/2025 at 12:20:22 PM

I would think there’s a major difference in the inference time commute for tools like this. And major providers can spend a lot more (at a loss) on the runtime compute. That’s just a guess though.

by lancekey

5/21/2025 at 5:38:35 AM

Has anyone cracked the nut of making videos longer than a few seconds though? No one seems to have made any progress on this. This is all nearly worthless until that is addressed.

by colordrops

5/21/2025 at 5:47:19 AM

I thought that for a while. Until it was pointed out to me that most long videos are made of 6 second shots.

Generating a long video one shot at a time kind of makes sense, as long as there's good consistency between shots

by plokiju

5/21/2025 at 6:29:41 AM

It would narrow down what you could do by quite a bit if you are limited to 6 seconds per shot. While that is average, there are many shots that are longer. Also, you bring up a good point about consistency between shots. That doesn't seem like as hard of a problem, but it's still a big one.

by colordrops

5/21/2025 at 1:42:47 PM

I don’t care if the normies dont do long shots. The best movies have very long shots. Children of men or 1917. Until AI video gen can get past 5-10 second shots of slop, we won’t see any major critically acclaimed AI movies or related.

by Der_Einzige

5/20/2025 at 11:54:03 PM

We will know GAI exists when there is no difference, because anything can be coded at any level of quality :)

by mensetmanusman

5/21/2025 at 12:11:11 AM

Well no, humans have 'natural' general intelligence and there's an obvious gap between an expert and a novice at any task.

by fooker

5/20/2025 at 7:48:40 PM

An indie film with poor production values, even bad acting can grip you, make you laugh and make you cry. The consistency of quality is key - even if it is poor. The directing is the red thread throughout the scenes. Anything with different quality levels interrupts your flow and breaks your experience. The problem with AI video content at this stage is that the clips are very good 'in themselves', just as LLM results are, but putting them together to let you engage beyond an individual clip will not be possible for a long time. It will work where the red thread is in the audio (e.g. a title sequence) and you put some clips together to support the thread. But Hollywood has nothing to fear at this stage. In addition, remember that visual artists are control freaks of the purest kind. Film is still used because of the grain, not despite it. 24p prevails.

by julianpye

5/20/2025 at 10:38:23 PM

You might want to look up NeuralViz on YouTube. 180k subscribers. They've been building out an entire cinematic universe using AI video tools. And it's by far the funniest show I've watched in years. So the claim that "let you engage beyond an individual clip will not be possible for a long time" isn't true. People are already doing it.

https://www.youtube.com/@NeuralViz

by rcarr

5/21/2025 at 8:01:40 AM

> "Lurking, Lifting, Licking"

Ok, I went from being pleasantly surprised to breakout laughter at that point.

But I also think this points out a big problem: high-quality stuff is flying under the radar simply because of how much stuff is out there. I've noticed that when faced with a lot of choice, rather than exploring it, people fall back into popular stuff that they're familiar with in a really sad way. Like a lot of door dash orders will be for McDonalds, or people will go back to watching popular series like Friends, or how Disney keeps remaking movies that people still go to see.

by preommr

5/21/2025 at 12:58:33 PM

I hadn't seen these before, but they're working because of the limitations of the technology.

The format of the shows are mostly clip-based - man on the street, news hour, etc - and obviously the jokes are all written by someone with a good sense of humour.

Not to discount that this is, as you say, an example of someone using AI to successfully create characters and stories that resonate with people. it's just still very much because of a creative human's talent and good taste that it's working.

by ikesau

5/21/2025 at 3:52:16 AM

https://www.youtube.com/watch?v=3XkrhhsV6zg

> You're not the monolith of me!

These other universe memes are too good.

by stevenhuang

5/21/2025 at 5:37:26 AM

Since the first GenAI started popping up, many people have glossed over the fact that they are just tools. All the anger from artists and keyboard warriors completely ignored the fact that you still need skill and time to make something good with these tools.

Artists aren't going to be replaced by AI tools being used by me on my iPhone, those artists were already replaced by bulk art from IKEA et al. Artists who reject new tools for being new will be replace by artists who don't. Just like many painters were replaced by photographers.

by wickedsight

5/21/2025 at 6:05:27 PM

>Artists aren't going to be replaced by AI tools being used by me on my iPhone

Except they already are.

https://societyofauthors.org/2024/04/11/soa-survey-reveals-a...

by jplusequalt

5/21/2025 at 11:50:01 PM

I don't understand the argument that artists won't be replaced. I already am using AI to generate the art I need rather than farming it out

by DoesntMatter22

5/20/2025 at 11:08:02 PM

This is the first time I've wanted more AI video content. Thanks for sharing.

by alixanderwang

5/21/2025 at 1:03:44 PM

The Dor Brothers on YouTube have also been making some very funny, stylized music videos with AI. They've managed to use the limitations to their advantage

by spaceman_2020

5/21/2025 at 3:47:38 AM

This is news is hilarious.

by goosejuice

5/21/2025 at 10:29:01 AM

[dead]

by computerthings

5/21/2025 at 1:46:08 PM

You don't need to make an entire movie out of this. One or two scenes that are difficult or impossible to film on a certain budget is enough to lift the production value of a movie. One can use this as CGI replacement, for example to produce a couple seconds scene of an ancient city and stretch that out with fake panning.

You can also use it as a communication tool such as making a "live" storyboard to prep location, blocking, maybe even as notes for actors.

by dlisboa

5/21/2025 at 2:41:53 PM

That storyboard idea is pretty huge. Imagine dailies go in the other direction - "here's how I want it to look"

by anton-c

5/21/2025 at 2:56:02 PM

Yeah, I did an amateur short once as part of a college assignment and doing the storyboard was the most difficult part for me as I'm not a good drawer at all. Getting the idea for a certain shot from my head to the paper was a struggle.

Being able to express visual ideas with words is one of the most powerful things of this AI craze. Text/code is whatever.

by dlisboa

5/20/2025 at 10:06:40 PM

AI video may be to Hollywood as photography was to painting. Photography wasn't "painting, but better" - it was a different thing. AI-native video may not resemble typical Hollywood 3-act structure. But if it takes enough eyeballs away from Hollywood then Hollywood will die all the same.

by sandspar

5/20/2025 at 10:22:57 PM

I think you're contradicting your own argument. Painting didn't die from photography.

Photography increased the abstract and more creative aspects of painting and created a new style because photography removed much of the need to capture realism. Though, I am still entranced by realist painting style myself, it is serving different purpose than capturing a moment.

by pedalpete

5/21/2025 at 5:52:39 PM

>Painting didn't die from photography.

Commercial portrait painters died out pretty fast.

by djeastm

5/20/2025 at 11:10:55 PM

I'm not an expert so I may be wrong about all this. But my impression is that Pictorialist photography aped painting for 50 years. Photography only came into its own as a "photography native" art form with Stieglitz and people like that around ~1905. By that time, non-representational painting styles like Cubism had already sucked the remaining juice out of painting, with Duchamp's 1917 urinal perhaps deserving credit for the coup de grace. Today painting is a shadow of what it once was - and public interest and auction prices reflect that. Museums occasionally have abstract painting exhibits but they're poorly attended because the public dislikes them. Ask a person on the street what their favorite painting movements are and likely every name will be more than 100 years old, possibly hundreds of years old. Compare auction prices between pre-1917 paintings and post-1917 paintings. Besides a few middlebrow pop artists like Dali or Warhol, meme painters like Pollock, or trendy political painters like Basquiat or Johns, the older paintings will be orders of magnitude more in demand. Painting used to move the conversation forward, now nobody cares.

by sandspar

5/20/2025 at 11:16:16 PM

> Ask a person on the street who their favorite painting movements are and likely every name will be more than 100 years old

I think you overestimate the publics art appreciation. The average answer will be a blank stare.

by sigmaisaletter

5/21/2025 at 1:29:38 AM

Well if you want to really be pedantic, they said every name, not average answer, so if most people reply with a blank state, that is still not in the set of every name, because they're not names. So the ones who actually do reply with a name are, as I agree with them, likely to be older than 100 years.

by satvikpendem

5/21/2025 at 10:10:27 AM

Have my upvote for pedantic overkill.

by sigmaisaletter

5/21/2025 at 4:04:55 AM

Sounds right, except calling Dali or Warhol "middlebrow." That's just weird.

by autobodie

5/20/2025 at 10:06:31 PM

There’s already more good content than anyone can watch. It’s impossible to disentangle strength of the art from strength of distribution. Google, the world’s biggest distributor of culture, is focusing on this problem they do not need to solve, instead of the one everyone in art actually suffers from, because: they’re bad at this. It’s that simple.

by doctorpangloss

5/21/2025 at 12:15:54 PM

Hollywood and other "real" films is like the 1% of video content though, as is youtube which has a top 1% of good content and a lot of shit.

AI tools used for any content will / are being used to add to the pile of shit.

by Cthulhu_

5/21/2025 at 11:48:47 PM

Most Hollywood and indie films aren't that good. I feel the complete opposite of this comment.

Id much rather start seeing individuals creating AI movies where you aren't bogged down by the need to hire actors and what bot

by DoesntMatter22

5/22/2025 at 10:39:27 AM

Sorry, but AI generated video is unwatchable. Even now, when it's really great. It just doesn't seem authentic.

by precompute

5/20/2025 at 6:41:21 PM

I'm sure by this point, and if not, pretty soon, everyone will have seen a clip of AI generated video and not thought twice about it.

Its something that is only obvious when it is obvious. And the more obvious examples you see, the more non-obvious examples slip by.

by Workaccount2

5/21/2025 at 2:24:38 AM

I saw a video today [1]. Millions of views, ten thousand comments, not a single commenter mentioned that it's AI generated.

If you look at the shadows in the background, you can see how they appear and disappear, how things float in the air, and have all the AI artifacts. The video is also slowed down (lower FPS) to overcome the length limit of AI video generator.

But the point is not how we can spot these, because it's going to be impossible, but how the future of news consumption is going to look like.

[1] https://www.tiktok.com/@calm.with.word/video/750583708327412...

by gpt5

5/21/2025 at 4:54:51 AM

Interesting. Here's the original channel - their videos all have a Picsart watermark in the bottom right.

I don't believe it's entirely fake, just enhanced.

https://www.youtube.com/@jrcollection5246/shorts

by jadamson

5/21/2025 at 2:56:17 AM

the detail around the eyes is a dead-giveaway for AI generated video

by xarope

5/21/2025 at 6:00:11 AM

Well, what does news (or any media) consumption look like now? It's been trending towards pure noise for a good while, and this is a way to further automate the generation of yet more noise.

by kilpikaarna

5/21/2025 at 11:48:05 AM

[dead]

by RobertBobert

5/21/2025 at 8:46:23 AM

The inverse follows from this and is even more scary. Soon there will be videos of something terrible happening, reported in the news, which is widely rejected as fake despite being real.

by alkonaut

5/21/2025 at 7:47:21 AM

As an artist and designer (with admittedly limited AI experience), where I feel AI to be lacking is in its poverty of support for formal descriptors. Content descriptors such as 'dog wearing a hat' are a mostly solved problem. Support for simple formal descriptors such as basic color terms and background/foreground are ok, but things like 'global contrast' (as opposed to foreground background contrast), 'negative shape', 'overlap', 'saturation contrast' etc etc... all these leave the AI models I have played with scratching their heads.

I like how Veo supports camera moves, though I wonder if it clearly recognizes the difference between 'in-camera motion' and 'camera motion' and also things like 'global motion' (e.g. the motion of rain, snow etc).

Obligatory link to Every Frame a Painting, where he talks about motion in Kurosawa: https://www.youtube.com/watch?v=doaQC-S8de8

The abiding issue is that artists (animators, filmmakers etc) have not done an effective job at formalising these attributes or even naming them consistently. Every Frame a Painting does a good job but even he has a tendency to hand wave these attributes.

by Daub

5/21/2025 at 8:25:20 PM

[flagged]

by dsadfjasdf

5/20/2025 at 6:22:49 PM

Wow, this is incredible work! Blown away at how well the audio/video matches up, and the dialogue is better sounding / on-par with dedicated voice models.

by carlosdp

5/21/2025 at 8:03:00 AM

I'd made a prediction/bet a month ago, predicting 6 months to a full 90 minute movie by someone sitting on their computer. [0]

The pace is so crazy that was an over estimation! I'll probably get done in 2. Wild times.

0: https://www.linkedin.com/feed/update/urn:li:activity:7317975...

by anilgulecha

5/21/2025 at 9:02:37 AM

It's doable now. Someone just needs to do it. With voice now it's completely doable. Just throw it all together add some effects and you've got a great movie... In theory

by DoesntMatter22

5/21/2025 at 10:58:42 AM

It's not a theory, at Cannes a feature movie has premiered that is generated entirely by AI. Made in Spain.

by jb1991

5/21/2025 at 5:08:15 PM

It's a great movie in theory. Idk how good the movie you mentioned is

by DoesntMatter22

5/21/2025 at 4:06:54 PM

can you share a link/details, please?

by anilgulecha

5/22/2025 at 3:36:31 PM

There's still a lot of work to be done. It's good at making short individual scenes but when you start trying to string them together the wheels start to come off a lot. This [0] pretty basic police raid leads to shootout video for example turns to mush pretty quick because even in the initial car ride the interior of the car's size and shape warps pretty drastically.

Feels like there's going to be a dichotomy where the individual visuals look pretty good taken by themselves but the story told by those shots will still be mushy AI slop for a while. I've seen this kind of mushy consistency hold up over the generations so far, it seems very difficult to remove becasue it relies on more context than just previous images and text descriptions to manage.

[0] https://www.reddit.com/r/ChatGPT/comments/1kru6jb/this_video...

by rtkwe

5/21/2025 at 2:00:52 PM

So what the copyright situation going to be in an ai generated movie?

My last recollection is recent case said AI generated didn’t have copyright?

by wingspar

5/21/2025 at 8:25:00 PM

I hope no copyright. Ideas are meant to be freely copied.

by BeFlatXIII

5/20/2025 at 10:08:59 PM

Google has partnered with Darren Aronofsky’s AI-Driven Studio Primordial Soup. I still don't understand why SAG-AFTRA's strike to ban AI from Hollywood studios didn't affect this new studio. Does anyone know?

by kapildev

5/20/2025 at 10:32:44 PM

Primordial Soup isn't a guild signatory, which means they aren't bound by the agreement negotiated during the strike. It also means they cannot hire guild actors for their projects, but that isn't a likely concern given the nature of the company.

by cjkaminski

5/20/2025 at 6:55:49 PM

This is technically impressive and I commend the team that brought it to life.

It makes me sad, though. I wish we were pushing AI more to automate non-creative work and not burying the creatives among us in a pile of AI generated content.

by nrjames

5/20/2025 at 7:27:43 PM

I think the non-creative work is coming... but it's harder, needs more accuracy, and just generally takes more effort. But it's 100% coming. AI today can one shot with about 80% perfection. But for use cases that need to be higher than that, that last 20% is grueling to gain. It's like taking a jet across the country, and then getting jammed in traffic while you're taking a taxi to your hotel.

by swalsh

5/20/2025 at 8:45:08 PM

80% on todo apps

by TechDebtDevin

5/20/2025 at 7:37:57 PM

There’s limited training data for physical movements. Once there is enough of that the non creative space will start getting their own LLMs.

by dyauspitr

5/20/2025 at 7:15:51 PM

The amount of gatekeeping I see when this topic is brought is outstanding! Why can't people be happy that more individuals would be soon able to create freely in a more accessible way?

Personally I can't wait to see the new creative doors ai will open for us!

by ahmedfromtunis

5/20/2025 at 8:17:57 PM

How is the requirement to use a computer and maybe pay a cloud subscription in the long term more accessible than other kinds of art? Which individuals are gatekept exactly? Before you bring up disabled people (as often happens when the term accessibility is used), know that many of them are not happy to be used as a shield for this without ever being asked and would rather speak for themselves.

I've tried AI image generation myself and was not impressed. It doesn't let me create freely, it limits me and constantly gravitates towards typical patterns seen in training data. As it completely takes over the actual creation process there is no direct control over the small decisions, which wastes time.

Edit: another comment about a different meaning of accessibility: the flood of AI content makes real content less accessible.

by alpaca128

5/21/2025 at 12:17:32 AM

>How is the requirement to use a computer and maybe pay a cloud subscription in the long term more accessible than other kinds of art?

Because many other kinds of art require thousands of hours to learn before getting to the level of current AI

The real gate keeper to art isn't the cost of a pencil, it's the opportunity cost of learning how to use it

Some people have creative ideas they cannot realise and tools like AI help them do it. The more people that can realise their creative ideas the better it is for everyone.

by alickz

5/21/2025 at 1:48:42 AM

so its for lazy people? who don't want to learn a skill? there are many ways to realize your creativity, and now you have to write your "prompt" for your creative result? then why not just write a story? like authors have been doing for ever ?

by emkoemko

5/21/2025 at 6:13:54 PM

>The more people that can realise their creative ideas the better it is for everyone

There is a fundamental misunderstanding of creating going on here. There's a reason why people always talk about the work/journey/process rather than the end goal. That's what makes someone an artist--not the end result.

by jplusequalt

5/21/2025 at 8:29:11 PM

do you think taking a photo of something is skipping to the end goal of painting? no. it will just be used as a tool for more creativity for the people who will do the work.

by dsadfjasdf

5/21/2025 at 9:10:00 PM

>do you think taking a photo of something is skipping to the end goal of painting

Going outside and taking a photo requires you to be engaging with the scene around you. You are the actor doing the thing. When you prompt an AI, the AI is the actor doing the thing.

by jplusequalt

5/22/2025 at 12:15:54 AM

Would you consider film directors to be artists?

by alickz

5/22/2025 at 12:24:00 PM

Yes! Directors orchestrate hundreds of people and other creatives, have a say in how scenes are shot, how actors perform their lines, which music gets used, which lines need to be changed or added, etc.

Not to mention that many directors also write the script for the film.

by jplusequalt

5/22/2025 at 4:15:10 PM

Wouldn't all this hold true for AI videos as well, it's just that you'll be going through hundreds of prompts and bad generations before you get to something approaching your vision?

N.B. If directing people is the distinguishing factor, then animation directors are robbed of their claim to artistry as well, as is basically any solo artist in history

by I-M-S

5/22/2025 at 6:41:01 PM

I never said directing people is the distinguishing factor. I only said a director is an artist as they are deeply involved in the creation of the art in question. There is no hidden layer of linear algebra that obfuscates the creative process and that does the majority of the work.

by jplusequalt

5/23/2025 at 9:23:07 PM

this view is really limited. like I said, people who will do the work... will. You are thinking of the majority of lazy people who aren't producing anything of real value in the design/art world already.

by dsadfjasdf

5/21/2025 at 8:26:52 PM

I don't care at all about the philosophy of art. I have creative goals in mind and not enough time to become competent at all the prerequisite skills to master them.

by BeFlatXIII

5/21/2025 at 9:08:31 PM

>I have creative goals in mind and not enough time to become competent at all the prerequisite skills to master them

You are consuming, not creating.

by jplusequalt

5/22/2025 at 3:11:57 PM

Cool story; don't care.

by BeFlatXIII

5/21/2025 at 1:46:53 AM

> How is the requirement to use a computer and maybe pay a cloud subscription in the long term more accessible than other kinds of art?

Sometimes I feel like HN comments are working so hard to downplay AI that they’ve lost the plot.

It’s more accessible because you can accomplish amazing storytelling with prompts and a nominal amount of credits instead of spending years learning and honing the art.

Is it at the same level as a professional? No, but it’s more than good enough for someone who wants to tell a story.

Claiming that computer access is too high of a bar is really weird given that computers and phones (which can also operate these sites) are ubiquitous.

> Edit: another comment about a different meaning of accessibility: the flood of AI content makes real content less accessible.

No it does not. Not any more than another person signing up for YouTube makes any one channel less “accessible”. Everyone can still access the content the same.

by Aurornis

5/20/2025 at 10:43:12 PM

Try and tell me that a single individual would have been able to create this 10 years ago on a normal salary:

https://www.youtube.com/@NeuralViz

It would have cost millions. Now one person can do it with a laptop and a few hundred dollars of credits a month.

AI is 100% making filmmaking more accessible to creative people who otherwise would never have access to the kind of funding and networks required to realise their visions.

by rcarr

5/21/2025 at 12:50:13 AM

Since 10+ years ago various YT channels have been creating interesting content, sometimes on less than a normal salary, filmed with a phone and with one person acting for three characters at once. And imho they (fortunately) won't disappear if what you linked is representative for what AI enables.

None of the videos I've clicked on required AI for the content to be good, and some of the randomness has no real reason to be there.

by alpaca128

5/21/2025 at 7:38:04 AM

Most of those "skits" can be done on a cheap camera with a single person being the actor. In fact, many such videos existed already in the past.

Also, they're painfully unoriginal. They're just grabbing bits that The Onion or shows like Rick & Morty have been doing and putting a revolting AI twist to it. It screams to me of 0 effort slop made for the sole purpose of generating money from morons with no creativity clicking on it and being bemused for 10 seconds

by sensanaty

5/21/2025 at 1:07:50 AM

> How is the requirement to use a computer and maybe pay a cloud subscription in the long term more accessible than other kinds of art?

Accessibility -- and I don't mean this in the sense particular to disability -- is highly personal; its not so much that it is more accessible, as that it is differentlly-accessible.

> I've tried AI image generation myself and was not impressed. It doesn't let me create freely, it limits me and constantly gravitates towards typical patterns seen in training data. As it completely takes over the actual creation process there is no direct control over the small decisions, which wastes time.

No offense, but you've only tried the most basic form of AI image generation -- probably something like pure text-to-image -- if that's what you are finding. Sure, that's definitiely what the median person doing AI image gen does, dumping a prompt in ChatGPT or Midjourney or whatever. But its not all that AI image generation offers. You can have as much or as little control of the small (and large) decisions as you want when using AI image generation.

by dragonwriter

5/20/2025 at 9:10:42 PM

There's a pretty standard argument that creating artworks should or must require the hardship of developing the skill to bring vision into reality, or paying someone who can. That can be debated, but the position is textbook gatekeeping.

Other disapproval comes from different emotional places: a retreading of ludditism borne out of job insecurity, criticism of a dystopia where we've automated away the creative experience of being human but kept the grim work, or perceptions of theft or plagiarism.

Whether AI has worked well for you isn't just irrelevant, but contrarian in the face of clear and present value to a lot of people. You can be disgusted with it but you can't claim it isn't there.

by Clamchop

5/21/2025 at 12:28:21 AM

Gatekeeping means restricting power or opportunities. Is it gatekeeping when a skill can be learned for free, and on top is also optional? Is it also gatekeeping to forbid students from using ChatGPT for homework because it would force them to go through the hardship of earning an education?

I have seen what 99% of people are doing with this "clear and present value". Turns out when you give people a button to print dopamine they probably aren't going to create the next Mona Lisa, they're just going to press the button more. Even with AI, creating compelling art is still a skill that needs to be learned, and it's still hard. And why would they learn a skill when they just decided against learning a skill? Incentives matter, and here the incentives massively favor quantity over quality.

> Whether AI has worked well for you isn't just irrelevant

My point was that it's creatively restrictive, and that current models tend to actively steer away from creative outputs. But if you want to limit yourself to what corporations training the models and providing the cloud services deem acceptable, go ahead.

by alpaca128

5/21/2025 at 12:50:49 AM

I stated exactly what the enumeration of arguments that I know of are and which one is gatekeeping. (It's still the first one, devaluing or dismissing work out of hand if AI was involved, regardless of other merit.)

by Clamchop

5/21/2025 at 12:32:17 PM

Good, it's a socially corrosive technology. Much like you might dismiss a quality product made with child labor.

by hooverd

5/21/2025 at 6:39:17 PM

Terrible analogy that trivializes child abuse.

by Clamchop

5/21/2025 at 1:48:23 AM

> Gatekeeping means restricting power or opportunities.

Gatekeeping commonly means excluding others from a group, a label, or an identity. That’s what they’re referring to.

by Aurornis

5/20/2025 at 7:53:11 PM

Remember how there were all those cake shows all of a sudden, and they were making cakes that looked super pretty, but they were just fondant and sheet cakes? We're not thrilled having to wade through the AI equivalent.

by _DeadFred_

5/20/2025 at 8:49:15 PM

> Why can't people be happy that more individuals would be soon able to create freely in a more accessible way?

The gates are wide open for those that want to put in effort to learn. What AI is doing to creative professionals is putting them out of a job by people who are cheap and lazy.

Art is not inaccessible. It's never been cheaper and easier to make art than today even without AI.

> Personally I can't wait to see the new creative doors ai will open for us!

It's opening zero doors but closing many

---

What really irks me about this is that I have _seen_ AI used to take away work from people. Last weekend I saw a show where the promotional material was AI generated. It's not like tickets were cheaper or the performers were paid more or anything was improved. The producers pocketed a couple hundred bucks by using AI instead of paying a graphic designer. Extrapolate that across the market for arts and wonder what it's going to do to creativity.

It's honestly disgusting to me that engineers who don't understand art are building tools at the whims of the financiers behind art who just want to make a bit more money. This is not a rising tide that lifts all ships.

by duped

5/20/2025 at 9:28:51 PM

> The gates are wide open for those that want to put in effort to learn.

Why is effort a requirement?

Why should being an artist be a viable job?

Would you be against technology that makes medical doctors obsolete?

by ahtihn

5/20/2025 at 9:47:26 PM

> Why is effort a requirement?

That's how human brains work. People have an intrinsic need to sort, build hierarchies and prioritize. Effort spent is one of viable heuristics for these processes.

> Why should being an artist be a viable job?

Art itself has great value, if it weren't, museums, theaters and live shows wouldn't exist.

> Would you be against technology that makes medical doctors obsolete?

The analogy doesn't work. The results of a medical process is a [more] healthy person. The result doesn't have any links to the one performing it. Result of an artistic creative process is an art piece, and art is tied to its creator by definition.

by ZoomZoomZoom

5/20/2025 at 11:08:30 PM

I think the result of an art piece is "wow that looks pretty cool / makes me feel this certain way". Human properties aren't required there. Sure, some art rests on there being human properties, but then AI wouldn't be able to replace it by definition. You can argue that AI lowers the discoverability of art that falls in the latter category, but I'd say that it's a solvable problem that can be fixed with better recommendation algorithms.

by jackphilson

5/21/2025 at 11:55:47 AM

> I think the result of an art piece is "wow that looks pretty cool / makes me feel this certain way".

This is an expansive definition and thus not useful, because it would include:

1. Natural phenomena (sand on a vibrating plate is pretty cool).

2. Folk crafts (this hand-woven rug sure ties the room together!).

3. Advertisement.

4. Industrial design (this soap dispenser looks like a droid head, awesome!).

5. Drug induced experiences.

6. Art forgery and plagiarism.

Nothing in the list is really art. Rough definition of art is an intentional process (or the results thereof) of self-expression, and/or interpretation/modeling of reality performed with symbolic means. This implies intentionality and a conscience, which current "AI" doesn't have.

> AI lowers the discoverability of art that falls in the latter category, but I'd say that it's a solvable problem that can be fixed with better recommendation algorithms.

Theoretically it is. However, it won't be ever solved and implemented widely due to the lack of incentives and the fact that just replacing it all with "AI" is much more profitable and exploitable.

by ZoomZoomZoom

5/20/2025 at 11:31:59 PM

I would be against technology that freezes medicine at our current understanding and makes it economically unviable to develop new medicine.

I don't think it's worthwhile to explain the inherent value of human created art or that to learn how to do it one must put some effort into it. All I can say is, if you are one of those people who do not understand art, please don't build things that take away someone else's livelihood without very good reason.

I don't think the majority of AI generation for art is useful for anything but killing artists.

by duped

5/21/2025 at 6:30:00 AM

On effort being a requirement: part of art is around playing with limits of a medium and finding a way in it, it takes a lot of trials, attempts, and errors for an artist to make their way. It's not a requirement per se, but something needed for someone to intent something different. Not worried about creative ways where artists explore AI, new things will come out of it and it's going to be interesting. Not worried either about post-modernists who already dropped requirements long time ago and tape bananas to walls, they'll find their way. But the category artists who make their way through the effort put in a medium, not only the narrative around the medium will be affected.

On jobs: craftsmanship is slightly different than art: industries are built with people who can craft, there is today an artistic part in it but it's not the essence of the job: the ads industry can work with lower quality ads provided they can spam 10x. There is however an overlap between art/craftmanship: a lot of people working in these industries can today be in a balance where they live with a salary and dedicate time to explore their mediums. We know what will happen when the craftmanship part is replaced by AI, being an artist will require to have the balance in the first place.

It feels like a regression: it leads to a reduction of ideas/explorations, a saturation of the affected mediums, a loss of intent. Eager to see what new things come out of it though.

by aimxhaisse

5/21/2025 at 12:30:25 AM

> Why should being an artist be a viable job?

Because we as a society have valued it as one for eternity.

by Klonoar

5/21/2025 at 4:25:54 AM

I think effort is a signal that whatever the thing is, the artist/team -really- wanted to put it out there and have people view it. They had -something- that drove them to take the time and effort to make the thing.

Zero-effort output generators like prompting means people are just generating trash that they themselves don't even care about. So why should I take my time to watch/experience that?

The whole "GenAI is accessible" sentiment is ridiculous in my opinion. Absolutely nothing is stopping people from learning various art mediums, and those are skills they'll always have unlike image generators which can change subscription plans or outright the underlying model.

Absolutely no one should be lauding being chained to a big corp's tool/model to produce output.

---

Why should being an artist be a viable job? Well, people should get paid for their work. That applies to all domains except technical people love to look down on art while still wanting to watch movies, well produced youtube videos, etc. You can see it in action here on HN frequently: someone will link a blog post they took time to write and edit... and then generate an image instead of paying an artist. They want whatever effect a big header image provides, but are not willing to pay a human to do it or do it themselves. Somehow because it's "just art" it's okay to steal.

---

If tech has progressed to the point of true "general artificial intelligence", then likely all jobs will be obsolete and we're going to have to rethink this whole capitalism thing.

I think all industries should be utilizing tech to augment their capabilities, but I don't think actual people should be replaced with our current guesstimator/bullshitter "AI". Especially not critical roles like doctors and nurses.

by huimang

5/21/2025 at 12:33:41 PM

Why do you want to destroy social mobility? Do you think you'll be on the safe side of the line?

by hooverd

5/21/2025 at 1:52:54 AM

i think real art will just go underground kind of like how it was pre internet, and the internet will be filled with AI slop

by emkoemko

5/20/2025 at 7:36:46 PM

There's nothing creative in having someone or something else doing the work for you.

"Creating" with an AI is like an executive "inventing" the work actually done by their team of researchers. A team owner "winning" a game played by the their team.

That being said, AI output is very useful for brainstorming and exploring a creative space. The problem is when the brainstorming material is used for production.

by gamblor956

5/20/2025 at 8:26:21 PM

The first two paragraphs of your argument could be used to discuss whether Photography (Camera is doing most of the work) or Digital Drawing (Photoshop is doing most of the work) are art.

Both things which were dismissed as not art at first but are widely accepted as an art medium nowadays.

by jampa

5/20/2025 at 8:46:05 PM

I see this comparison to a camera a lot but I don't think it works (not that you're saying this, I'm just contributing). I'm not an expert but to me the camera is doing very little of the work involved in taking an artistic picture. The photographer chooses which camera to use to get a certain effect, which lenses, the framing, etc. All the camera is doing recording the output of what the person is specifying.

by MattGrommes

5/21/2025 at 3:53:11 AM

I think there's a sliding scale in both cases. Vanilla prompting something like DALL-E 3 and uncritically accepting what it spits out is the AI equivalent of dime-a-dozen smartphone snapshots of the Eiffel Tower or an ocean sunset. But like your description of professional photography, there are more intricate AI approaches where an expert user can carefully select a model, a fine-tune/LORA, adjust the temperature or seed, inpaint or layer different elements, and of course have the artistic vision to describe something interesting in the first place.

by Jordan-117

5/20/2025 at 9:57:49 PM

Photography mostly eliminated the once-indispensable portrait artist, among other formerly-dependable lines of work.

There's a line to be drawn somewhere between artist and craftsperson. Creating beautiful things to a brief has always been a teachable skill, and now we're teaching it to machines. And, we've long sought to mass-produce beautiful things anyway. Think textiles, pottery, printmaking, architectural adornments.

Can AI replace an artist? Or is it just a new tool that can be used, as photography was, for either efficiency _or_ novel artistic expression?

by Clamchop

5/21/2025 at 12:00:59 AM

Billions of R&D and millions of man hours made the camera exist. It’s doing most of the embodied work.

by mensetmanusman

5/20/2025 at 9:59:46 PM

Everyone has a phone camera, and takes photos, but not everyone is a photographer, and even photographers wouldn’t proclaim all their photos “art”.

AI cannot “democratize art” any more than the camera did, until the day it starts teaching artistry to its users.

by blargey

5/20/2025 at 11:02:25 PM

> until the day it starts teaching artistry to its users.

It almost definitely can start teaching artistry to its users, and the same people who are mad in this thread will be mad that it's taking away jobs from art instructors.

The central problem is the same and it's what Marshall Brain predicted: If AI ushers in a world without scarcity of labor of all kinds, we're going to have to find a fundamentally new paradigm to allocate resources in a reasonably fair way because otherwise the world will just be like 6 billionaire tech executives, Donald Trump, and 8 billion impoverished unemployed paupers.

And no, "just stop doing AI" isn't an option, any more than "stop having nuclear weapons exist" was. Either we solve the problems, or a less scrupulous actor will be the only ones with the powerful AI, and they'll deploy it against us.

by xp84

5/20/2025 at 9:33:12 PM

The first two paragraphs of your argument could be used to discuss whether Photography (Camera is doing most of the work) or Digital Drawing (Photoshop is doing most of the work) are art.

The work a camera does is capturing the image in front of the photographer. "Art" in the context of photography is the choice of what in the image should be in focus, the angle of the shot, the lighting. The camera just captures that; it doesn't create anything that isn't already there. So, not even remotely the same thing as AI Gen.

The work of Krita/Inkscape/etc (and technically even Photoshop) is to convert the artistic strokes into a digital version of how those strokes would appear if painted on a real medium using a real tool. It doesn't create anything that the artist isn't deliberately creating. So, not even remotely the same thing as AI Gen.

AI Gen, as demonstrated in the linked page and in the tool comparison, is doing all of the work of generating the image. The only work a human does is to select which of the generated images they like the best, which is not a creative act.

by gamblor956

5/20/2025 at 7:47:14 PM

I don't think that's true. Is a film director not a creative?

You could come up with your own story and direct the AI to generate it for you.

by andoando

5/20/2025 at 9:40:50 PM

In your example, the "come up with your own story" part is the creative part. But you're not "directing" the AI to generate it for you. You're just giving it a command. You're selecting from the results it outputs, but you're not controlling the output.

A film director is a creative. Ultimately, they are in charge of "visualizing" a screenplay": the setting, the the design of the set or the utilization of real locations, the staging of the actors within a scene, the "direction" of the actors (i.e., how they should act out dialog or a scene, lighting, the cinematography, the use of stunts, staging shots to accommodate the use of VFX, the editing (meaning, the actual footage that comprises the movie).

There's an old show on HBO, Project Greenlight, that demonstrates what a director does. They give 2 directors the same screenplay and budget and they make competing movies. The competing movies are always completely different...even though they scripts are the same. (In the most extreme example from one of the later seasons, one of the movies was a teen grossout comedy, and the competing movie was some sort of adult melodrama.)

by gamblor956

5/20/2025 at 9:51:13 PM

So 1. being able to bring your own story come to life automatically is cool in itself, and would result in a lot of creative media that is not possible now. Do you know how many people have their own stories, plays, etc that are dying to find someone rich enough to get them published?

2. Using AI can be can be an iterative process. Generate this scene, make this look like that, make it brighter colors, remove this, add this, etc. That's all carefully crafting the output. Now generate this second scene, make the transition this way, etc. I don't see how that's at all different from a director giving their commands to workers, except now you actually have more creative control (given AI gets good enough)

by andoando

5/21/2025 at 11:21:03 PM

1. We already have that now, it's called Word. Most people are just too lazy to write out their story. AI doesn't improve the situation, it makes it worse. It will become vastly harder to find the good stuff in the avalanche of crap.

2. Current AI can't do what you're describing, so the biggest difference is that you're posing a hypothetical against the real world. But more specifically: the director already has a specific vision in their hand; the purpose of the "direction" is to bring this vision into reality within the scope of their budget and resources. With AI, you have a general idea and the AI creates its own vision and you pick what you like the best, until you ultimately realize the AI isn't going to get what you actually want and you settle for the best the AI can do for you. So, completely different.

by gamblor956

5/21/2025 at 12:25:23 AM

>But you're not "directing" the AI to generate it for you. You're just giving it a command.

That's what direction is though. Film directors prompt their actors and choose the results they like best (among many other commands to many other groups)

>You're selecting from the results it outputs, but you're not controlling the output.

The prompt controls the output (and I bet you'd have more control over the AI than you'd have over a drunk Marlon Brando)

by alickz

5/20/2025 at 7:50:42 PM

Not even if you are directing and refining it? What if i smudge out sections repeatedly and over the course of say 20 iterations produce a unique image that matches closely what i am imagining, and that has not be seem before?

by cloverich

5/20/2025 at 7:45:24 PM

> There's nothing creative in having someone or something else doing the work for you.

This would include almost everyone who’s used any editing software more advanced than photoshop CS4.

by MichaelZuo

5/20/2025 at 8:05:39 PM

Buñuel would disagree with you: "The peak of film-making will be reached when you are able to take a pill, switch off the lights, sit facing a blank wall and project on it, directly from your eyes, the film that passes through your head."

by kmijyiyxfbklao

5/21/2025 at 2:36:41 PM

No need for any pills. Such tech exists for centuries and is colloquially known as "reading books". It requires some lighting though

by GenshoTikamura

5/20/2025 at 8:12:23 PM

[dead]

by GuinansEyebrows

5/21/2025 at 9:57:44 AM

if I wasnt around to witness it over the last 10 years I would have thought most commenters on HN were bots pretending to be offended and gatekeeping for obscure profit motives.

So the bad news is people are just insecure, jealous, pedantic, easy to offend, highly autistic - and these are the smart ones.

The good news, is with dead internet theory they will all be replaced with bots that will atleast be more compelling make some sort of sense.

by ionwake

5/21/2025 at 1:31:55 PM

God forbid people have an opinion

"Oh you're posting on hackernews, if you don't suck google's dick and every single gadgets megacorps shit out you must be highly autistic".... interesting take

by lm28469

5/21/2025 at 1:46:55 PM

This is objectively one of the most autistic websites on the internet. Likely worse than 4chan /g/. Just as dang and PG like it though!

by Der_Einzige

5/22/2025 at 12:22:53 PM

Sure is! And it’s great.

Sorry I being autistic didn’t even phrase it well I think people got too offended I was just saying the bad side to it, HN is great.

Couple of people were so upset at the suggestion they replied in a defensive manner I should have been more careful with my rant

by ionwake

5/21/2025 at 2:04:24 PM

If you call motives of designers, painters, musicians, filmmakers and other c-word professionals obscure, then I have bad news for whatever occupation you have for a living. Maybe it's just because it depends on how well AI sells, or because there's suddenly a perceived window of opportunity wide open at some fee for everyone unable to c-word anything on their own to finally become able, even if by relying on corporate crutches, the window you deem endangered by said gatekeepers?

by GenshoTikamura

5/21/2025 at 1:33:33 PM

"happy", "free", "creative", "accessible"

What a weird way to spell "give $200 a month to google"

by lm28469

5/20/2025 at 8:52:43 PM

Individuals won’t be able to do anything. The artist here is the LLM. There is no AI art where the human in the loop carries any significance. Proof of that is you can’t replicate their work using tbt same LLM. In AI art, the AI is the artist. The human is just a client making a request.

And who owns the AI?

It’s delusional. Stop falling for the mental jiu Jitsu from the large AI labs. You are not becoming an artist by using a machine to make art for you. The machine is the artist. And you don’t own it.

by kranke155

5/20/2025 at 7:19:50 PM

Creative? There's nothing creative in it.

by StefanBatory

5/20/2025 at 7:26:44 PM

It's funny that you're defending creativity by being close-minded about a creative new way to explore it. You're being your judgment of an entire new medium based on a few early examples. It's as if you're downplaying photography just based on the few first blurry, dark clichés produced. Let's keep our minds open to new forms of creativity if we really care about it so much.

by ahmedfromtunis

5/20/2025 at 7:28:08 PM

What grandparent means is that the AI enables human creativity in a new way.

by kleiba

5/21/2025 at 2:42:05 AM

"Rap isn't even music, they aren't even singing!"

Rap is really the best example of how stupid these discussions are.

The language models are amazing at rhyming so by the logic then anyone can become a rapper and that is going to put current rappers out of business.

Only someone who has never tried to rap could possibly believe this. Same thing with images, same thing with music, same thing with literally everything involving human creativity.

by rxtexit

5/20/2025 at 10:15:33 PM

It does but so does it encourage efortlessness and haste. It definitely is something new, it remains to be see whether creatives bring it to new heigts as with other mediums. I remain a bit skeptical but am open to it. One thing is certain, a tsunami of content is upon us.

by onemoresoop

5/21/2025 at 3:17:49 PM

Finally figured out what gp stands for. Ty.

by anton-c

5/20/2025 at 7:07:02 PM

> burying the creatives among us in a pile of AI generated content.

Isn't the creativity in what you put in the prompt? Isn't spending hundreds of hours manually creating and rigging models based on existing sketch the non-creative work that is being automated here?

by ehsankia

5/20/2025 at 7:15:42 PM

How does a prompt describe creativity? It's a vision so far off that it's so frustrating because greater creativity came from limited tools, greater creativity came from imperfections, a different point of view, love, a slightly off touch of a painter or a guitar player, the wood of the instrument and the humidity affecting. I can go on and on, prompts are a reduction to the minimum term of everything you'd want to describe, no matter how much you can express via a prompt

by mirkodrummer

5/21/2025 at 1:47:00 AM

I agree that limitation does lead to some creativity, but I wouldn't say lack of limitation means no creativity. Saying prompts have no creativity is like saying books or scripts have no creativity compared to a movie. Not only that, these tools can actually take images and sketches.

Imagine a world where you have a scene fully sketched out in your head (i.e. creativity), you have the script of what will happen, sketches of what the scene looks light, visual style, etc. You want to make that become reality. You could spend a ton of time and money, or you could describe it and provide sketches to an AI to make it come true.

Yes, the limitations in the former can make you take creative shortcuts that could themselves be interesting, but the latter could be just as true to your original vision.

by ehsankia

5/21/2025 at 6:22:32 AM

Imagine a world where(spoiler alert you shouldn't it's already there) your vision is limited by yourself and a light technician gives a different point of view that you like, a punk pioneer in your team that is the computer graphics chief animator that gives life to your animatronics(jurassic park, and still unmatched today, cgi looks so fake and cheap). And yes you should spend a lot of time, not necessarily money, because art is time and something you have spent zero time in it its valued zero money

by mirkodrummer

5/21/2025 at 3:04:24 AM

Huh? is poem not creative? If I write a poem and tell AI to create a painting that expresses that poem visually, is that not creative?

by jryle70

5/21/2025 at 6:14:12 AM

Your hand that moves the brush on the painting will express more than thousand words, it's your hand, your movement, your actual emotions, your tools

by mirkodrummer

5/21/2025 at 6:08:48 PM

No it's not because you aren't creating the painting, the AI is. Having an idea is not sufficient to create art, you have to sit down and actually create it. If you hired a painter to create the work based off your poem, I'm not going to attribute that painters creation to you.

by jplusequalt

5/20/2025 at 7:21:52 PM

So deliberately writing a prompt that meticulously describes how a generated photo would look like isn't creative, but pushing a button for a machine to take the photo for you is??!! If anything, it's the way around!

Of course that's not what I believe, but let's not limit the definition of what creativity based on historical limitations. Let's see what the new generation of artists and creators will use this new capability to mesmerize us!

by ahmedfromtunis

5/20/2025 at 10:08:47 PM

No, it isn’t, because the prompt doesn’t have a millionth of the information density of the output.

Merely changing a seed number will provide endless different outputs from the same single prompt from the same model; rng.nextInt() deserves as much artist credit as the prompter.

by blargey

5/20/2025 at 8:36:56 PM

Your meticulous prompt is using the work of thousands of experts, and generating a mashup of what they did/their work/their commitment/their livelihood.

Their placement of books. Their aesthetic. The collection of cool things to put into a scene to make it interesting. The lighting. Not yours. Not from you/not from the AI. None of it is yours/you/new/from the AI. It's ALL based underneath on someone else's work, someone else's life, someone else's heart and soul, and you are just taking it and saying 'look what I made'. The equivalent of a 4 year old being potty trained saying 'look I made a poop'. We celebrate it as a first step, not as the friggen end goal. The end goal is you making something uniquely you, based on your life experience, not on Bob the prop guys and Betty the set designer whose work/style you stole and didn't even have the decency to reference/thank.

And your prompt won't ever change dramatically, because there isn't going to be much new truly creative seedcorn for AI to digest. Entertainment will literally go into limbo/Groundhog Day, just the same generative, derivative things/asthetics from the same AI dataset.

by _DeadFred_

5/20/2025 at 9:14:20 PM

And that's exactly how your brain work. What you call "creativity" is nothing more than exactly that: mixing ideas and thoughts you were exposed to. We're all building on others' work. The only difference is that computers do it on a much larger scale. But it's the very same process.

by ahmedfromtunis

5/20/2025 at 9:42:26 PM

This is completely absurd and reductive point of view, which I always assume is a cop out. Just because it's called "machine learning" doesn't mean it actually has anything to do with how human learning or human brain works, and it's certainly not "exactly how" or "very same". There's much more going on on in human creative process, aside from mere "mixing": personal experience, understanding of the creative process, technique and style development, subtext, hidden ideas and nuances, etc. Computers are very good at mixing and combining, but this is not even close to what goes into actual creative process. I hate this argument

by renerick

5/21/2025 at 3:13:38 AM

> personal experience, understanding of the creative process, technique and style development, subtext, hidden ideas and nuances

All of these are just human being exposed more to life and learning new skills, in other words -- having more data. LLM already learns those skills and encounters endless experience of people in its training data.

> I hate this argument

That's very subjective. You don't know how the brain works.

by jryle70

5/21/2025 at 6:30:44 AM

You seem to not fully understand the quote. LLMs learn patterns/noises from an existing output, these are not skills nor endless experience learned. It's like saying you learn how to make a cake by learning how it should look like not how it's composed. LLMs mock the studio ghibli style images, didn't invent the style or learned the endless experience the studio accumulated over the years. In fact it's a mocking of the images and it just looks horrible

by mirkodrummer

5/21/2025 at 7:35:08 AM

It's not just more data, it's deeper understanding of the fundamentals, of the idea and of the tools used, as well as the process of creation itself. It's what makes studying art interesting: why did author chose to do this and that, what's their style, what was the process, etc. For LLM the answers will universally be "because it was in the prompt and there was appropriate training data" and "the author prompted the model until the model returned something tolerable". You may argue that not all art has or needs depth, or that not all people are interested in it, but that doesn't mean that we should fill our cultures with empty boring slop.

> That's very subjective

I was expressing my opinion of this argument which absolutely is subjective

> You don't know how the brain works.

Neither does grandparent comment's author, didn't stop them from making much bolder claims.

by renerick

5/20/2025 at 10:37:16 PM

But you aren't being creative here. Just using the 'average' of tons of actually creative peoples work to create an 'average' computer predicted scene. The opposite of art. Warhol already did it and did it better.

If I see a painting, I see an interpretation that makes me think through someone else's interpretation.

If I see a photograph, I don't analyze as much, but I see a time and place. What is the photographer trying to get me to see?

If I see AI, I see a machine dithered averaging that is/means/represents/construes nothing but a computer predicted average. I might as well generate a UUID, I would get more novelty. No backstory, because items in the scene just happened to be averaged in. No style, just a machine dithered blend. It represents nothing no matter the prompt you use because the majority is still just machine averaged/dithered non-meaning. Not placed with intention, focused with real vision, no obvious exclusions with intention. Just exactly what software thinks is the most average for the scene it had described to it. The better AI gets, the more average it becomes, and the less people will care about 'perfectly average' images.

It won't even work for ads for long. Ads will become wild/novel/distinct/wacky/violations of AI rules/processes/techniques to escape and belittle AI. To mock AI. Technically perfect images will soon be considered worthless AI trash. If for no other reason than artists will only be rewarded for moving in directions AI can't going forward. The second Google/OpenAI reach their goal, the goal posts will move because no one wants procedural/perfectly average slop.

by _DeadFred_

5/21/2025 at 2:07:03 AM

artist have a style,you can see a work of art and know who made it, with these AI images its all random all over the place no direction, they can call them self's artists but i will never see them as that

by emkoemko

5/20/2025 at 8:45:54 PM

Eh, every AI "artist" want the cachet of being an artist without any of the effort, but they're competing with other AI "artists" so they have no choice but to unleash a firehose of content sludge onto the commons in a race to the bottom.

by hooverd

5/20/2025 at 7:23:07 PM

To answer your first question: carpe diem! And historical limitations? Go visit the Sistine Chapel, unmatched still today

by mirkodrummer

5/20/2025 at 9:33:19 PM

For better or worse, a big chunk (if not most) of the AI development probably does go into non-creative work like matching ads against users and ranking search results

It's just not what gets the exciting headlines and showcases

by dktp

5/20/2025 at 7:17:15 PM

Distribution of art (particularly digital) is a recent phenomenon. Prior to that, art in human history was one-off. Are we just going back to that time?

Similarly with music, prior to recording tech, live performance was where it was at.

You could look at the digital era as a weird blip in art history.

by sarks_nz

5/21/2025 at 6:10:13 AM

It's definitely coming. Creative work is first because there's zero constraints on it. Doing non-creative work, you're bound to hit a constraint - real world or otherwise - immediately, and AI is only just starting to navigate that.

by mindwok

5/20/2025 at 11:50:26 PM

Data for the non creative work isn’t as easy to, uh, “obtain” from others without their consent.

by owlboy

5/21/2025 at 2:31:39 AM

The fact that so many feel the same way about this technology (I do too!) is an indictment of humanity, not the technology itself.

We _could_ use this to empower humans, but many of us instinctively know that it will instead be used to crush the human spirit. The end result of this isn’t going to be an expansion of creative ability, it’s going to be the destruction of creative jobs and the capture of these creative mediums by a few large companies.

by skepticATX

5/21/2025 at 5:55:58 AM

> The end result of this isn’t going to be an expansion of creative ability, it’s going to be the destruction of creative jobs and the capture of these creative mediums by a few large companies.

I agree , but that's the negative. The positive will be that almost any service you can imagine (medical diagnosis, tax preparation, higher education) will come down to zero, and with a lag of perhaps a decade or two it will meet us in the physical world with robo-technicians, surgeons and plumbers. The cost of building a new house or railway will plummet to the cost of the material and the land, and will be finished in 1/10 of the time it takes today. The main problem to me is that there's a lag between the negatives and the positives. We're starting out with the negatives and the benefits may take a decade or two to reach us all equally.

by weatherlite

5/21/2025 at 6:12:47 AM

> The positive will be that almost any service you can imagine (medical diagnosis, tax preparation, higher education) will come down to zero

Why would you want massive amounts more of those things? In fact I might even argue that medicine, taxation and education are a net negative on society already. And that to the extent that there seems to be scarcity, it's mainly a distribution problem having to do with entrenched interests and bureaucracy.

> The cost of building a new house or railway will plummet to the cost of the material and the land

That's is the actual scarcity tho.

by kilpikaarna

5/21/2025 at 6:37:01 AM

> Why would you want massive amounts more of those things? In fact I might even argue that medicine, taxation and education are a net negative on society already. And that to the extent that there seems to be scarcity, it's mainly a distribution problem having to do with entrenched interests and bureaucracy.

I'm not sure what you mean. In my country getting a specialist to take a look at you can take weeks, the scarcity is that there's not enough doctors. For sure many people get delayed and suboptimal diagnosis (even if you finally get to see the specialist, he may have 10 mintues for you and 50 other patients to see that day). A.I can simply solve this.

> The cost of building a new house or railway will plummet to the cost of the material and the land

>> That's is the actual scarcity tho

Not necessarily, the labor costs a tremendous amount, and also it might be that we don't need to cram tens of millions of people around cities anymore if most work is automated, we can start spreading out (again, this will take decades and I'm not denying we have pressing problems in the immediate future).

by weatherlite

5/21/2025 at 3:12:23 PM

So the humankind was waiting for AI to bring down all costs to zero, good lord! I thought it was waiting for steam engine, penicillin, railroads, aviation, robotics, computers, nuclear energy, space flight to bring that forth!

by GenshoTikamura

5/21/2025 at 6:19:33 AM

> We _could_ use this to empower humans, but many of us instinctively know that it will instead be used to crush the human spirit. The end result of this isn’t going to be an expansion of creative ability, it’s going to be the destruction of creative jobs and the capture of these creative mediums by a few large companies.

The same was said about the camera or photoshop.

by sekai

5/21/2025 at 3:16:45 PM

The kind of argument which boils down to "the death of the human spirit is imminent, because it is never OK to stop where we are, and only a step forward is possible, because there are plenty steps already taken behind"

by GenshoTikamura

5/21/2025 at 8:32:26 PM

you act as if the human population has no agency to choose what they want? This will be another tool for good and bad. People will make beautiful things the world hasn't seen before, and others will use it for propaganda. just like all things we touch

by dsadfjasdf

5/20/2025 at 6:59:09 PM

I’m a creative and I’m really glad that more people can express themselves

Just wanted to add representation to that feeling

by yieldcrv

5/20/2025 at 7:10:53 PM

Expressing themselves by generating boilerplate content?

Creativity is a conversation with yourself and God. Stripping away the struggle that comes with creativity defeats the entire purpose. Making it easier to make content is good for capital, but no one will ever get fulfillment out of prompting an AI and settling with the result.

by lilwobbles

5/20/2025 at 8:42:28 PM

Check out all the creatives on /r/screenwriting, half the time they are trying to figure out how to "make connections" just to get a story considered. It's a fucking nightmare out there. Whatever god is providing us with AI is the greatest gift I could imagine to a creative.

by ivape

5/20/2025 at 10:26:49 PM

AI could be useful if used like any other tool, but not as an all in box where everything is done for you minus the prompt. Im actually worried people will become lazy

by onemoresoop

5/21/2025 at 8:54:38 AM

People will always become lazy with new tools. But not all people.

It'll lower the barrier of entry (and therefore the quality floor before people feel comfortable sharing something "they made" if they can deflect with an easy "the AI made this" versus "I put XY0 hours into this"), but it'll also empower people who wouldn't otherwise even try to create new things and, presumably, follow their passion to learn and do more.

by drusepth

5/21/2025 at 3:35:50 PM

Im sure not all people will become lazy but im worried about the trend in general.

by onemoresoop

5/21/2025 at 11:23:45 PM

Creativity is an expression. It comes from the heart. Hard work isn't always the greatest vehicle for creativity. We just think it is. I've seen plenty of things that clearly took a lot of execution but fundamentally lack creativity, often becoming an exhibition in technical virtuosity.

Here's something you can try to prove it to yourself. Sit down and write a novel. It'll be like squeezing blood out of a rock unless your heart is ready to do it freely. You'll see that if you force yourself through hard work to do it, you'll just end up with something that people will laud as creative due to the execution but it'll lack everything about free-flowing creativity. Good programmers are lazy, so are good creatives, but now I'm just repeating myself.

It's a lot easier squeezing blood out of a heart, especially for the lazy.

by ivape

5/20/2025 at 9:38:32 PM

exactly, creatives and everyone else can always do something fulfilling for themselves just like before AI. They can struggle all they want and continue doing it for no capital. because that process is fulfilling to them.

by yieldcrv

5/20/2025 at 10:24:19 PM

Yes but how many will sign up for that? Im sure few will continue to do so but creativity will certainly take a big hit.

by onemoresoop

5/21/2025 at 1:37:14 AM

> Yes but how many will sign up for that? Im sure few will continue to do so but crea

It’s not important to me that they do.

> Im sure few will continue to do so but creativity will certainly take a big hit.

I’ve seen the workflows for AI generated films like https://youtu.be/x6aERZWaarM?si=J2VHYAHLL3og32Ix and I find it to be very creative. Its more interesting to me that this person would never have raised capital and tried to direct this, but this is much closer to what they wanted to create. I’m also entertained by it, whether I was judging it for generative AI issues or not.

by yieldcrv

5/20/2025 at 7:20:26 PM

They always could express themselves.

by StefanBatory

5/20/2025 at 9:36:52 PM

Not close to the way they wanted, and at too much sacrifice to the other things they were interested in or supported their family with

by yieldcrv

5/21/2025 at 2:08:46 AM

so they where never interested in the first place... but now they can call them self's artists after prompting a AI to make a image....

by emkoemko

5/21/2025 at 3:06:47 AM

the level of discipline needed in a trade has gone down in almost every trade, including all mediums of art, for centuries

its not really anyone's problem, and generally limited to the people that made way too much of their identity to be based on a single field, that they feel they have to gatekeep it

its great that people can express themselves closer to their vision now

by yieldcrv

5/20/2025 at 7:04:17 PM

Do you find it sad that people can use recordings, and don't have to hire musicians any more?

by toenail

5/21/2025 at 1:36:36 PM

That's step 1, we're at step 100, it looks like that now:

https://www.youtube.com/shorts/rtxJ0t8Cf6g

https://www.youtube.com/shorts/wjaSPHRNfjQ

by lm28469

5/20/2025 at 7:20:14 PM

Recordings of who? Not only sad but a disaster, I'm sorry but anyone that ever tried to play an instrument seriously knows how much human touch/imperfections come into play, otherwise you're just an anonymous guy playing in a cover band(like the ai will do)

by mirkodrummer

5/20/2025 at 10:19:45 PM

A tsunami of effortless content is upon us and that will change many things including tastes, probably for the worse. People not having to learn instruments because same can be done with a prompt is a tragic loss for humanity, not because human work is better but because of the lost experience and joy of learning, connection with the self and others and so many other things.

by onemoresoop

5/21/2025 at 3:29:52 AM

> because of the lost experience and joy of learning, connection with the self and others and so many other things.

Nothing prevents human from continue doing just that, precisely because it brings joy and satisfaction. Painting, photography classes are still popular, if not more, in the age of digital photography.

by jryle70

5/20/2025 at 9:41:29 PM

Robotics will come in the next few years. If you believe the AI2027 guys, though, the majority of work will be automated in the next 10 years, which seems more and more plausible to me every day.

by BosunoB

5/21/2025 at 4:18:04 AM

Are you independently wealthy enough to benefit from that or someone who should invest in suicide pills for themselves and their family if that day comes?

by hooverd

5/21/2025 at 2:59:02 PM

Some people haven't overcome their childhood desire to die or suffer so as to make the parents regret their decision not to buy that candy or take the puppy home. They imagine dismal future as a glorified way to suffer. Talk about cyberpunk - there's that sweet alluring promise to spend a whole life eating instant ramen sitting next to a window with a blinking neon sign and endless rain behind it, coding routinely to a lofi soundtrack, or lurking lonesomely about the techno-slum neighbourhood hiding their faces from CCTV behind masks

by GenshoTikamura

5/21/2025 at 3:24:32 PM

You know, I'd really prefer not to, but I have eyes and object permanence. Maybe we'll get the Iain Banks Culture future!

by hooverd

5/22/2025 at 3:11:24 PM

Why invest in weaksauce suicide pills when you could instead invest in nitrogen compounds and suicide bomb the tallest nearby building? Just because you've already lost doesn't mean they get to win, let alone survive.

by BeFlatXIII

5/22/2025 at 4:02:29 PM

Maybe we'll get the Iain Banks Culture future!

by hooverd

5/21/2025 at 5:07:20 PM

Multimodal LLMs are currently the natural research step towards AGI robots that can do mundane non-creative work. I believe this is just the reality of the situation. If you can generate a video of a robot doing the dishes then your model understands the physical world quite well. That should be useful for robot control.

by golol

5/20/2025 at 8:09:32 PM

Plenty of non-creative work can be automated.

Have a look at the workflow and agent design patterns in this video by youtuber Nate Herk when he talks about planning the architecture:

https://m.youtube.com/watch?v=Nj9yzBp14EM

There’s less talk about automating non-creative work because it’s not flashy. But I can promise it’s a ton of fun, and you can co-design these automations with an LLM.

by cadamsdotcom

5/20/2025 at 9:58:58 PM

This kind of tech will open up filmmaking to a much wider base of creative talent.

by ugh123

5/21/2025 at 9:53:59 AM

What is non-creative work? I think the term reeks of elitism. Every job is creative, even picking up garbage can become an art when one puts effort in it.

There is a more sensical distinction between work that is informational in nature, and work that is physical and requires heavy tools in hard-to-reach places. That's hard to do for big tech, because making tests with heavy machinery is hard and time consuming

by seydor

5/21/2025 at 11:06:03 AM

> even picking up garbage can become an art when one puts effort in it.

good lord. talk about pedantic.

by jb1991

5/20/2025 at 8:37:05 PM

Dude.

Making a movie is not accessible to most people and it's EVERYONES dream. This is not even there yet, but I have a few movies I need to make and I will never get a cast together and go do it before I die. If some creatives need to take a backseat so a million more creatives can get a chance, then so be it.

by ivape

5/20/2025 at 10:10:07 PM

Yeah, there will be so many AI generated videos that many will go unwatched. Not sure where this is heading but it's certainly an interesting future.

by onemoresoop

5/20/2025 at 10:52:22 PM

Nobody will watch anyone's "creations". The TV or whatever device watch on will observe what you and everyone else engage with and how you interact with it and create content for you on the fly, just like Instagram and TikTok's feeds do now.

AI creatives can enjoy the brief blip in time where they might get someone else to watch what they've created before their skills become obsolete in an exponentially faster rate just like everyone else's.

by briankelly

5/20/2025 at 11:39:40 PM

Then everyone can just get their own personal movies and infinite content stream. Honestly people would probably like that given how atomized society has become.

by jackphilson

5/20/2025 at 11:13:32 PM

Sounds terrifying to me.

by grugagag

5/20/2025 at 7:19:47 PM

It makes me sad that the US and western Europe which have been the most flexible and forward-thinking societies in the world for generations have now memed themselves into fretting and hand-wringing about technical advances that are really awesome. And for what? The belief that illustration and filmmaking which have always been hobbies for the vast majority of participants should be some kind of jobs program?

by woah

5/20/2025 at 8:18:05 PM

People aren't looking forward to companies playing the "how much sawdust can you put in a rice crispy before people notice the difference" experiment on the entertainment industry. The quality of acting, scripting, lighting, and animation in the film/television industry already feels second rate to stuff being made before 2020. The cost cutting and gutting of cultural products is becoming ridiculous, and this technology will only be an accelerant.

by dmonitor

5/20/2025 at 8:36:14 PM

If you don't like a movie, then don't watch it.

by woah

5/21/2025 at 11:14:17 AM

I tried Whisk to generate images which I then animated, thinking it would be using the newest model. But then I noticed that Veo 3 and Imagegen 4 are only usable through Flow, and only if you're on the most expensive plan. AI Studio also only shows Imagegen3 and Veo2 as media generating options.

My main issue when trying out Veo 2 was that it felt very static. A couple elements or details were animated, but it felt unnatural that most elements remained static. The Veo 3 demos lack any examples where various elements are animated into doing different things in the same shot, which suggests that it's not possible. Some of the example videos that I've seen are neat, but a tech demo isn't a product.

It would be really cool if Google contracted a bunch of artists / directors to spend like a week trying to make a couple videos or short movies to really showcase the product's functionality. I imagine that they don't do that because it would make the seams and limitations of their models a bit too apparent.

Finally, I have to complaint that Flow claims to not be available in Puerto Rico: "Flow is not available in your country yet." Despite being a US territory and being US citizens.

by TheAceOfHearts

5/21/2025 at 3:13:33 PM

You can use imagen 4 in vertex ai. But no Veo 3.

Also Google is going to have to tread carefully, people in the entertainment industry are already AI hostile, and they dictate a surprising amount of public opinion.

by Workaccount2

5/21/2025 at 8:26:19 PM

You can use veo 2 for free in the google ai dashboard. like 5 a day

by dsadfjasdf

5/20/2025 at 6:48:04 PM

Has anyone actually tried Veo3 and know if it’s as good as this looks?

The demo videos for Sora look amazing but using it is substantially more frustrating and hit and miss.

by jonplackett

5/21/2025 at 2:34:35 AM

Here is a twitter user that is posting videos generated with Veo3 (watch unmuted):

https://x.com/fofrAI

by gpt5

5/21/2025 at 6:18:55 AM

I can definitely see this being used for lower end advertising

I’ve noticed ads with AI voices already, but having it lip synced with someone talking in a video really sells it more

by arduinomancer

5/21/2025 at 12:42:49 AM

Basic principles:

1. People like to be entertained.

2. NeuralViz demonstrates AI videos (with a lot of human massaging) can be entertaining

To me the fundamental question is- "will AI make videos that are entertaining without human massaging?"

This is similar to the idea of "will AI make apps that are useful without human massaging"

Or "will AI create ideas that are influential without human massaging"

By "no human massaging", I mean completely autonomous. The only prompt being "Create".

I am unaware of any idea, app or video to date that has been influential, useful or entertaining without human massaging.

That doesn't mean it can't happen. It's fundamentally a technical question.

Right now AI is trained on human collected data. So, technically, It's hard for me to imagine it can diverge significantly from what's already been done.

I'm willing to be proven wrong.

The Christian in me tells me that Humans are able to diverge significantly from what's already been done because each of us are imbibed with a divine spirit that AI does not have.

But maybe AI could have some other property that allows it to diverge from its training data.

by cynicalpeace

5/20/2025 at 7:47:36 PM

>>models create, empowering artists to bring their creative vision

Interesting logic the new era brings: something else creates, and you only "bring your vision to life", but what it means is left for readers questioning, your "vision" here is your text prompt?

Were at a crossroads where the tools are powerful enough to make the process optional.

That raises uncomfortable questions: if you don’t have to create anymore, will people still value the journey? Will vision alone be enough? What's the creative purpose in life? To create, or to to bring creative vision to life? Isn't the act of creation is being subtly redefined?

by gloosx

5/20/2025 at 8:06:04 PM

It's being redefined in such a way that 2-3 very large entities get to hold the means of production. It's a very convenient redefinition for them.

by dmonitor

5/20/2025 at 8:48:39 PM

[flagged]

by teitoklien

5/20/2025 at 9:34:38 PM

> I make songs daily

Sorry for being blunt, but you do not. You receive some music matching your request from a service offered by an entity which aims to control as much of content creation and distribution as possible, up to total monopolization.

> I’m not a musician... Now I can with $10-$15 / month.

If you want to create music, do it, it doesn't require much money. If you just want to listen, there's literally thousands of authors creating all kinds of authentic, sincere, daring, skillfully performed, carefully mixed music, giving it away for next to nothing and still striving to find their listeners.

What you pay for is avoiding the effort of finding what suits you.

by ZoomZoomZoom

5/20/2025 at 9:51:07 PM

No amigo, no artist out there made a japanese song on my anime obsessed schoolteacher friend and his life.

When he received that song from me, he was super excited for next 3-5 days and still listens to it and flexes them to their friends.

Same thing happened with a lot of my other friends and me too, I have an apple shortcut script that generates songs for me daily based on my routine for that day pulled from todoist.com

I still listen to and pay for others music and songs, but this experience with AI is entirely different.

What I pay for is not avoiding effort of finding what suits me but creating what suits me.

> You receive some music matching your request from a service offered by an entity which aims to control as much of content creation and distribution as possible, up to total monopolization.

What I receive is a high fidelity song made from prompts that i’m given full ownership of , when pooled at scale between all users allows people to make their own song generators with GPUs.

Its very nature is the opposite of monopoly. I’d love to hear how you think the big 3 corps (Universal, Sony, etc) who own all the music almost globally are not a monopoly ? Never had an experience where your spotify or apple music streaming albums disappear randomly due to those big 3 corps ?

My friend’s song of himself will never disappear that mp3 file he can store on a pen drive, load ig anywhere, gift to anyone. How is that the “monopoly” ?

by teitoklien

5/20/2025 at 10:08:45 PM

> when pooled at scale between all users allows people to make their own song generators with GPUs.

I can assure you, local generation is to become a fringe activity, same as self-hosting web services, only worse, because the quality gap (which in case of software is often negligible) will be insurmountable.

> I’d love to hear how you think the big 3 corps (Universal, Sony, etc) who own all the music almost globally are not a monopoly?

It's not a monopoly, it's a cartel. Luckily, they don't own everything, though, too much they do.

> no artist out there made a japanese song on my anime obsessed schoolteacher friend and his life.

Ok, what you describe is commissioning. Yeah, you can't argue with the fact it now can be done almost free and is becoming good enough for most, but you have to keep in mind, this process had been feeding a considerable amount of artists who do it to keep producing their art. Cutting this source of income is not wrong per se, but the consequences are the opposite of supporting the diversity and abundance in arts.

by ZoomZoomZoom

5/20/2025 at 10:37:38 PM

> but the consequences are the opposite of supporting the diversity and abundance in arts.

I made 5 songs about 5 different people in a week, with carefully crafted lyrics and tones described by me in the custom prompt, that led to 10 mp3 files of songs (suno generates 2 songs per prompt) Those songs are out there, it’s different, it’s not sloppy it’s actually quite enjoyable.

Now there is more diversity and abundance those songs wouldn’t have existed without AI and there are millions doing it like me out there, those artists who produce songs also have same tools as me, they can be better than me, faster, better, more albums now made by them, edit stuff to perfection, ideate and iterate faster. Who is stopping them ?

Tell me this song is trash and slop : https://suno.com/song/c36741d6-ec62-4922-86f9-6fd0b6f37497

This is in replacement of me and my friends listening to their same 20-40 artists who would be in billboards list each month.

Tell me it has hurt the abundance and diversity of songs out there, that it stopped someone from making their own thing or others listening to their songs, I listen to that song, it’s made by someone else with AI, I don’t mind, it’s awesome !

by teitoklien

5/21/2025 at 4:45:02 AM

You didn't make anything, aside from whatever lyrics, if any, you wrote yourself.

The song is adequate. The beat, drop, and chords are not that complicated, you could've learned how to do that yourself and not be chained to suno or whatever.

I would be delighted if a friend took the time to personally write and produce a song about me. I would not care at all if I realized it was just auto-generated stuff being pumped out "5x per week". The former I would cherish, the latter would be disposable junk to be listened to once and then forgotten about.

It's slop because it doesn't mean anything or has any value, not because it's literally sloppy.

---

"listening to their same 20-40 artists who would be in billboards list each month"

That's a you problem, quite literally a skill issue. There are so many indie artists making wonderful music that's leagues better than whatever hyper-distilled swill is on the radio at the moment.

---

"tell me it has hurt the abundance and diversity of songs out there"

Actual artists who make music composition and production their career ARE hurt when they can't find jobs/gigs/commissions because of things like suno. These tools are predicated on wholesale theft of music produced by humans.

But you can generate disposable songs at breakneck pace that don't mean anything, so it doesn't matter.

by huimang

5/20/2025 at 10:57:30 PM

What that service generates for you is content, it's not art. I don't understand how you can mix one for the other. The only artistic creative process happening is you making a prompt and reiterating. You're free to disagree, but I posit it's not nearly enough to qualify.

> Tell me this song is trash and slop

As you wish.

What I argue is not about "diversity of songs", but that's this "AI" revolution strives to further the process of replacing the arts with commodity substitutes. It will be sleek, shiny, comforting, friendly and, possibly, enjoyable, but it's almost worthless in itself.

by ZoomZoomZoom

5/21/2025 at 7:55:50 AM

I'm sorry but that song you posted is the most generic, bland garbage that imitates the literally millions of other pop songs they stole in their training sets. It sounds like Charlie xcx without any of the actual creativity, and the actual track itself you could learn how to make yourself with FL Studio in about half an hour following one of the many tutorials out there.

by sensanaty

5/20/2025 at 9:11:02 PM

If your focus is to solve the problem, then it makes sense to treat the process as secondary. The tools are just means to an end.

This view also aligns with how generative AI is marketed – it's a way to accelerate realization, not a way to focus on the act of crafting.

That said, outcome-first thinking does run the risk of disconnection, and our current culture is all about disconnection.

by gloosx

5/20/2025 at 9:23:04 PM

Even the process is more democratized now, Want to learn coding ?

Build out your app idea first with replit -> Then export the codebase into your computer -> Run claude code on it and ask it to scan all the files and describe the tech stack to you and how it operates while giving you all the major components you need to learn to understand it with youtube channel and book recommendations for each topic + work exercises -> Use perplexity deep research once a week to further research every topic as you start to learn them

If you’re a busy man/woman make gumloop or lindyai workflow to check your calendar and pack in timeslots to do all of this learning, and then auto send you worksheets via email as homework to test you skills

All of this for a price of 1/15th of a college degree (not even an expensive college)

This is not hypothetical conjecture I do this daily.

So everyone has now 1) Low cost access to build stuff with one prompt to realise the value of tools 2) A personal tutor that can then help you scour the depths of the craft and force you to practice and learn deeply now with your added motivation of knowing what’s possible with building stuff

So it has the potential to connect us more too, it’s upto humans to choose whether they do at the end tho. That is their liberty.

by teitoklien

5/20/2025 at 9:33:06 PM

This raises another interesting question to think about: if everyone has (1) low-cost access to build anything, and (2) low-cost access to learn how to build anything at the same time...

...what do you think a human would choose the most?

by gloosx

5/20/2025 at 9:42:55 PM

Hence said is their liberty, Im not contesting what you’re worried about. That idiocracy will rise and our nextgen will only have surface level thoughts with AI being more and more being the subtle decider of every human’s choice in the background.

That is probably the more likelier possibility. However it just shows the lack of philosophy in our modern times, people don’t do things they are lazy about and a Choice between the easy way and hard way is no longer a choice for majority the easy way’s dangling carrot is the final ultimatum.

I think i’ll leave it at the thought that as time progresses to find value in day to day life, to force ourselves to choose the right thing, philosophy will again have to become a much stronger actor in our lives, or else we’d all drive off a cliff at current rate.

At the end what happens will be decided by choice and liberty of humans as their choices expand.

by teitoklien

5/20/2025 at 9:19:24 PM

That's step one, and you are right.

Step two is... sure, every pleb can now create art.

That devalues art. More than that, that makes for a "winner takes all" marketplace. So even fewer people than now benefit from it. More than that, guess who wins out: middlemen, the marketplace owners.

Read the Black Swan by Nassim Taleb, especially the chapters about Extremistan and Mediocristan. Basically every time we invent something that scales and unlocks something for a great amount of people, we commoditize it and the quality of life for the average person in that field goes down while the leeches, pardon me, the middle men, are the only ones that become constantly rich, after the initial struggle to achieve market dominance (so when the market matures).

by oblio

5/20/2025 at 9:27:24 PM

Tell that to the kings in medieval times who were the only ones who could afford music from finest crafts men and silk shirts

This allowed people in those crafts to be very rich.

Then came the evil leeches of middle men who brought cheap fine clothes for the masses and music in the hands of every broke college kid in his dorm room. So evil !

Get the sarcasm ? Calling ability for masses to do more things is somehow horrible is elitist, calling the people who make that possible leeches is just elitism in velvet glove.

As long as the majority gets more value in their daily life, world is better.

by teitoklien

5/20/2025 at 10:38:00 PM

>Calling ability for masses to do more things is somehow horrible is elitist

What are they doing other than consuming with extra bells and whistles tacked on to it? I'm sorry, but it's not art.

by jplusequalt

5/21/2025 at 10:06:48 AM

The problem, as usual, is one of scale. Quantity has a quality all its own, and all that.

We maybe needed to go from 1 artist in 1 million people to 1 artist in 10 000 people. We do *not* need to go from 1 artist in 10 people.

And regarding "cheap fine clothes", that was sort of fine.

But what about when cars drive themselves (16.2 million persons - 10.3% of the U.S. labor force works in transportation)?

AND

What about when software mostly writes itself (4,4 million persons)?

AND

What about when stuff mostly manufactures itself (13 million persons)?

AND

... ?

(repeat ad-nauseam for every field upended right now by AI)

If you think MAGA was bad (MAGA being mostly caused by manufacturing jobs being moved away/automated) what do you think will happen when 40% of the population is unemployed and unemployable, forever?

by oblio

5/20/2025 at 9:26:17 PM

> but what it means is left for readers questioning, your "vision" here is your text prompt?

Right. Imo you have to be imagination handicapped to think that creative vision can be distilled to a prompt, let alone be the medium a creative vision lives in its natural medium. The exact relation between vision, artifact, process and art itself can be philosophically debated endlessly, but, to think artifacts are the only meaningful substrate at which art exists sounds like an dull and hollowed-out existence, like a Plato’s cave level confusion about what is the true meaning vs the representation. Or in a (horrible) analogy for my fellow programmers, confusing pointers to data with the data itself.

by klabb3

5/20/2025 at 8:43:17 PM

LLM providers want to a) make you dependent on their services as you outsource your skills and cognition and b) use that dependency to skim the cream off every economic activity.

by hooverd

5/21/2025 at 8:39:14 AM

> b) use that dependency to skim the cream off every economic activity.

Exactly. Probably the most important quote of modern times is, I think it was a CEO of an ISP that said it: "we don't want to be the dumb pipes" (during a comparison with a water utility company).

Everyone wants to seek rents for recurring revenue someone else actually generates.

by oblio

5/20/2025 at 9:22:43 PM

we can see what happened to opera/theater/hand drawn art as conclusive answer. humans move on to the newer more easier to create/consume thing in general (digital music/tv/digital art) and a small percentage of people treat the older mode of creation as high art coz it's more difficult and expensive to learn / implement.

by kkarakk

5/20/2025 at 9:45:11 PM

Calling cinema a "new form" of theatre is quite a simplification. It was certainly inspired by theatre, but the two differ in almost every aspect: medium, communication language, cultural role, and audience dynamics. Most people throughout history probably never experienced theatre or opera – so they didn’t move from them to cinema; rather, cinema emerged as a more accessible and reproducible medium for those.

Theatre and opera are regarded as high art because they are performed live in front of an audience every time, demanding presence, skill, and immediacy – unlike cinema, which relies on a recorded and edited performance.

by gloosx

5/21/2025 at 12:49:32 PM

Umm no. GTA 6 will cost on the order of 2bn to make, more than some wars. You will likely play it. Your kids definitely will.

What is true is that cheapening the cost of creative production will yield a wider variety of expression: we will see what people prefer to consume.

by vessenes

5/20/2025 at 8:23:14 PM

I've been doing AI art since 2022 and I'm still both disappointed and not quite surprised that this still is a pervasive view of what it takes to create anything high quality using AI.

If you take any high quality AI content and ask their creator what their workflow is, you'll quickly discover that the complexity and nuance required to actually create something high-quality and something that actually "fulfills your vision" is incredibly complex.

Whether you measure quality through social media metrics, reach, or artistic metrics, like novelty or nuance, high quality content and art requires a good amount of skill and effort, regardless of the tool.

Standard reading for context: https://archive.org/details/Bazin_Andre_The_Ontology_of_Phot...

by lxe

5/20/2025 at 8:38:26 PM

>If you take any high quality AI content and ask their creator what their workflow is, you'll quickly discover that the complexity and nuance required to actually create something high-quality and something that actually "fulfills your vision" is incredibly complex

This comes off as so tone deaf seeing your AI artwork is only possible due to the millions of hours spent by real people who created the art used to train these models. Maybe it's easier to understand why people don't respect AI "artists" with this in mind.

by jplusequalt

5/21/2025 at 12:46:47 PM

I also feel unhappy when painters get formal art training which summarizes millions of human hours of work. Even worse they then go to Florence and waste their time stealing art by painting..exact copies of other people’s paintings!

I feel a real artist would make their own tools: brushes, paint, canvas, and above all be truly creative by not unfairly using anything that’s gone before. If they did they aren’t creative; they’re a thief.

by vessenes

5/21/2025 at 3:36:31 PM

You said you're an AI artist, but you've just dismissed all artists upon whose work you build your own art as thieves, so either you've admitted yourself a thief as well or you got some serious trouble with logic

by GenshoTikamura

5/21/2025 at 6:17:01 PM

Sorry, I think you missed the sarcasm in my original post. I am saying that gatekeeping what art is, and what theft is, typically ignores exactly how art is made, how artists are trained, and the history of tools impacting creative endeavors -- basically close to what you say here: there are logical and historical errors that invalidate these complaints in my mind.

p.s. def not an artist.

by vessenes

5/21/2025 at 4:15:24 PM

Ok, there is fundamentally a difference in creation versus consumption here though.

by hooverd

5/21/2025 at 6:00:21 PM

Spending thousands of hours honing a craft and along the way learning from those who came before you, is not the same as using these models. You're being overly reductive to try and force a similarity, but in the process you just come off as disingenuous.

by jplusequalt

5/21/2025 at 6:14:47 PM

No. You're gatekeeping what "craft" is.

Is craft making your own chisels for woodworking?

Perhaps there are craftsman who buy chisels made by others.

Okay. Then is craft only making furniture with dovetail joints by hand?

Well, I guess people use planers.

So, no it's not just hand made wood working that's craft.

Someone uses a CnC machine with a design they made to cut wood, then hand sands and polishes. Is that craft?

What if you learned it took them three or four times as many hours to learn the CnC machine and design as it did to hand plane a cedar log?

To be clear, I don't identify as an artist at all, but I do have a stake in this conversation -- which is that I'd like more young folks to be positive, pick up tools at their disposal and build good things with them. The future's coming, and it's going to be built out by people with open minds who are soaking up everything they can about whatever tools are available. It's a sort of brain rot to gatekeep technology advances out of creativity.

by vessenes

5/21/2025 at 7:34:38 PM

>which is that I'd like more young folks to be positive, pick up tools at their disposal and build good things with them

People have had access to tools for creating for generations. In the modern era you can buy a pencil and a sketch pad for dollars. You can buy an instrument used for as little as a hundred dollars. Hell, schools teach art and music for free.

>The future's coming, and it's going to be built out by people with open minds who are soaking up everything they can about whatever tools are available.

Not all technology presents a net good for society. These technologies only exist on top the mountain of stolen artwork created by millions of artists, and this tech will continue to hamper the livelihoods of artists as long these companies are pushing them.

>It's a sort of brain rot to gatekeep technology advances out of creativity.

JFC. Don't talk to me about brain rot. The "art" and "creativity" you speak about here is just more finely grained consumption. Now instead of scrolling through a feed, you can ask Google to present your dopamine addicted brain exactly what you want to see in that moment.

In contrast, focusing on improving a craft acts as a sort of antidote to "brain rot" because you're engaging in multiple important things at once:

- critical thinking

- delayed gratification

- habit formation

- emotional exploration

- and more

by jplusequalt

5/21/2025 at 9:08:05 PM

I agree on the benefits of a hand-type craft, and that it’s an antidote to brain rot. Totally with you.

I agree with the idea of “Amistics” (thanks Neal) - a sort of societal and moral lens to view technologies through and evaluate them. Totally with you there too.

I agree that doomscrolling and social media are cancer-y in the extreme, to the extent that for a number of years I printed a daily personal newspaper. Srsly.

> this tech will continue to hamper the livelihoods of artists …

Nope. We’ll just redefine what an artist is. Pop quiz: did Disney employ more “artists” when each cel of a film was hand drawn and colored, or now when these modern “faux-artists, not like the real ones” have access to rendering clusters?

Or a second pop quiz, when da Vinci or Rubens ran workshops where apprentices painted “da Vincis” or “Rubens(s?)” who was the artist?

By the way, it’s right to redefine what an artist is. I’m going to get super controversial, ca 1900 and say that photographers can be artists. Now I’m going to get super controversial ca 1910 and say that someone mounting a bicycle wheel as a ‘readymade’ and displaying it can be an artist. Wait, now I’m going to move ahead the 1980s and say a cow cut in half and suspended in some sort of formaldehyde can be art. Hang on. A poem on a disk that deletes itself as its read is art.

The art is the creative endeavor itself. It’s the outcome of a creative person engaging with whatever tools they want to create some output. If someone wants to engage with an LLM or diffusion model or whatever and have it make something to those standards, it’s art. Calling them ‘not an artist’ based on their choice of tools is just totally incorrect.

I’m not saying all uses of diffusion models or any other AI assisted imagery is art. But I am saying that ingesting and summarizing publicized images is not theft, and people choosing to use those tools to instantiate a creative vision can absolutely be art, and further that generally the cheaper a form of creative expression becomes the better on balance for the world.

by vessenes

5/21/2025 at 9:50:53 PM

>Calling them ‘not an artist’ based on their choice of tools is just totally incorrect

Here's the crux of the issue I have with this entire conversation--because you now are able to generate "artwork," you expect the artistic community to respect you as an artist. You're waltzing into the room with none of the same battle scars, experiences, or morals and demanding that they bestow upon you the title of "creator".

>By the way, it’s right to redefine what an artist is

Sure, but let artists be the ones who take charge in redefining what art is. How is it right to redefine what "art" and "creating" is without the goodwill or consent of the artistic community at large? You are effectively trying to force a hostile takeover of the space, to demand everyone consider your generated image/song/video be treated with the same amount of respect as actual art.

If you can't even be bothered to respect the artistic community enough to understand why they feel slighted over the creation of these tools, or to empathize with them over their impact in livelihood due to the proliferation of AI slop, why the hell do you expect them to consider you an artist?

by jplusequalt

5/22/2025 at 1:28:37 AM

Some ad hominem here; and a bunch of goalpost shifting. I’m not claiming to be an artist. Are you, by the way?

If you look through civitai and the stable diffusion subreddit you’ll find people who’ve spent thousands of hours tuning these AI tools to produce something that they imagined. In my mind, they’re artists. It might be bad art, some of it is, some of it is arguably not, but they fit the description to me. They certainly think of themselves as struggling to create things they envisioned, and sometimes achieving it.

As to who gets to define art and what art is: please understand that I’m saying —>> you are gatekeeping <<— by calling people who spend thousands of hours creating imagery they want to create “not part of the artistic community”.

So, I have a broader view of the artistic community than you, full stop. It includes people whose livelihood is going to be disrupted by this technology. It includes a bunch of people who couldn’t create imagery they imagined before but can now.

Just as I can understand why Luddites burned shit in Northern England, I can understand and even respect a fight from interest groups to turn back the clock on new technology. And I am interested to see how strong guilds like SAG navigate and negotiate new economies around creating.

End of the day - I think moralizing in order to limit human creativity with bullshit made up rules about what an artist “is” or should be is foolish, wrong-headed, and ultimately doomed as an endeavor, plus it runs the risk of convincing new creatives not to engage. It’s a net loss for human creative output, while advocates get to pearl-clutch about the evils of tech. It’s just the wrong, wrong, wrong attitude to have about it; probably a waste of time trying to convince one well-spoken person on HN to change their views. But, hopefully you will. You could still rail against the tech by the way, or advocate for protectionism or a bunch of stuff, even if you decided to accept a person could use a diffusion model to make something creative.

by vessenes

5/20/2025 at 8:37:43 PM

Given the pervasiveness of AI slop with hundreds of thousands of likes on Facebook, I wouldn’t be so sure about using social media metrics as proof of high skill and effort.

by kevinventullo

5/20/2025 at 8:30:42 PM

Text prompts are very short now, but that can quickly change if prompt following improves.

Software Engineers bring their vision to life through the source code they input to produce software, systems, video games, ...

by tintor

5/20/2025 at 7:36:05 PM

>Imagen 4 is available today in the Gemini app, Whisk, Vertex AI and across Slides, Vids, Docs and more in Workspace.

I'm always hesitant with rollouts like this. If I go to one of these, there's no indication which Imagen version I'm getting results from. If I get an output that's underwhelming, how do I know whether it's the new model or if the rollout hasn't reached me yet?

by Imnimo

5/20/2025 at 11:20:31 PM

Indeed. At the bottom of their Imagen page, they link to Google AI Studio:

https://aistudio.google.com/generate-image

But this still says it's Imagen 3.0-002, not Imagen 4.

by cubefox

5/21/2025 at 10:17:59 AM

Yes, Google is so, so, so bad at this. I even struggle with gemini often telling me it can't make images, until I tell it that it can, and then it does. I have no idea what's really supposed to be supported or not in gemini.

It is so confusing. Ok, I got gemini pro through workspace or something, but not everything is there? Sure, I can try aistudio, flow, veo, gemini etc to figure out what I can do where, but so bad UX. Just tried using gemini to create an image, definitely not the newest imagegen as the text was just marbled up. But I can't see which version I'm on, genious.

Edit: After clicking through lots of google products I'm still not able to find a single place I can actually try the new imagegen, despite the article claiming it's available today in X,Y,Z

by matsemann

5/20/2025 at 8:00:51 PM

Google is typically upfront about which model versions you're using in those tools. Not as behind-the-scenes as ChatGPT.

However, looking at the UI/UX in Google Docs, it's less transparent.

by minimaxir

5/21/2025 at 7:54:37 AM

Older people on social networks are cooked. I mean in general, we are entering an age where making scams and spreading false news will be easily done with 10$ of credits.

by ssijak

5/21/2025 at 1:31:19 PM

Yeah i fear that too, my grandma is already sending me links of AI animals that she thinks is real, and the horrible/beautiful art of facebook memes/holiday cards, seems to be completely overtaken by AI. We know that full fake video of you with your own voice asking for something or even interacting on a video call is basically solved problem. Prime time to reestablish and confirm trusted channels with the people you care about.

by asl2D

5/21/2025 at 3:25:03 PM

Recently at a family dinner we established that any kind of unsolicited contact that falls outside typical conversation - Asking for money, sending money, pretty much anything with money - you must say the word that only people in our family would know.

Ironically, this would be a good application of AI, where the AI listens in on their calls, and will flag conversation that warrants the keyword being said.

by Workaccount2

5/21/2025 at 4:09:25 PM

Funny but also illustrative issue:

in the owl/badger video, the owl should fly silently.

This is an interesting non-trivial problem of generalization and world-knowledge etc., but also?

There's something somewhat sad about that slipping through; it makes me think, *no one involve in the production of this video, its selection, it passing review... etc., seemed to realize that it is one of the characteristic things about owls that you don't hear their wings.

We have owls on our hill right now and see them almost every day and regularly seem them fly. It's magic, especially in an urban environment.

by aaroninsf

5/21/2025 at 7:37:32 PM

The silent flight of an owl (BBC):

https://www.youtube.com/watch?v=-WigEGNnuTE

Longer version:

https://www.youtube.com/watch?v=-3ZnrhPtER8

by thangalin

5/20/2025 at 5:58:46 PM

Got a bit of an uncanny valley feeling with the owl and the old man videos. And the origami video give me a sort of sinister feeling, seemed vaguely threatening, agressive.

by elzbardico

5/20/2025 at 10:53:08 PM

We've made so much progress in the last 20 years; it used to take huge teams of developers and artists and giant compute clusters and rendering time to generate uncanny valley!

Now it just takes giant compute clusters and inference time.

by benlivengood

5/20/2025 at 6:48:01 PM

Lower on the page there's a knitted characters version that feels much better. It seems like for some of these, divorcing yourself from reality a little bit helps avoid the uncanny valley.

by jjcm

5/20/2025 at 7:27:01 PM

The owl one had that glow that so many AI images have for some reason. The man was very impressive to me.

by thinkingtoilet

5/20/2025 at 6:30:31 PM

It's a reflection of yourself.

Origami for me was more audio than video. Felt like it's exactly how it would sound.

by vjerancrnjak

5/21/2025 at 4:41:50 AM

First test is... very confusing - https://x.com/Seancheno/status/1925049073230372980

by sech8420

5/21/2025 at 11:24:15 AM

Can we talk about the elephant in the room, porn and i mean the weird and dangerous one? that moment in history of AI is going to happen and when it did shit will hit the fan.

by afroboy

5/21/2025 at 3:06:49 PM

AI porn already exist.

Im pretty sure kid/child ai porn already exist somewhere. But i'm quite lucky despite knowing rotten.com and plenty of other sides, never having seen real so i doubt i will see fake child porn.

Whats the elephant in the room now? Nothing changed. Whoever consumes real will consume fake too. FBI/CIA will still try to destroy cp rings.

We could even think it might make this situation somehow better because they might consume purely virtual cp?

by Flamentono2

5/22/2025 at 9:22:38 AM

> Whats the elephant in the room now?

Your family will be target for example, just imagine your daughter in high-school getting bullied by these type of generated AI videos. it's easy to say nothing happen, but when it happen to you you will be aware how fucked is these AI videos.

by afroboy

5/23/2025 at 8:22:01 AM

If someone bullies someone else, they will do it with anything they have.

At least with AI Video you can now always say its AI video.

Is it shitty that this is possible? yes of course. But hidding knowledge never works.

We have to deal with it as adults. We need to educate about it and we need to talk about it.

by Flamentono2

5/21/2025 at 6:06:34 PM

I've already seen headlines about this:

https://arstechnica.com/tech-policy/2025/02/25-arrested-so-f...

by nielsbot

5/22/2025 at 1:20:56 AM

AI-generated porn cannot be “dangerous” for the same reason a dream cannot be dangerous: they are not real, no matter how weird it is

by nomdep

5/21/2025 at 8:37:58 PM

Finally, the single most powerful force on earth comes for child sexual abuse: It'll be much cheaper to use AI than to abuse actual children.

We should all be hoping AI-generated CSAM floods the CSAM market, instead of trying to restrict AI so that we artificially prop the market up and cause harm to many more humans.

by stavros

5/21/2025 at 12:53:01 PM

Do you think anybody will really care? People were generating CSAM basically as soon as image generation become accessible. And for the less dangerous stuff situation is way more rampant already, both in free and commercial way.

by asl2D

5/21/2025 at 1:55:20 PM

Yes. Deepfaked porn is already a widespread mechanism for harassment, both among children and adults. As it gets even easier to create and more and more convincing it will just get worse.

by UncleMeat

5/21/2025 at 1:48:11 PM

You should have AI start writing your comments for you so at least then they'll make sense.

by HamsterDan

5/21/2025 at 5:51:29 AM

A whale coming out of the street in Manhattan, a women with a Jellyfish belly walking in the woods.

Why is it that all these AI concept videos are completely crazy?

by baxtr

5/21/2025 at 10:54:27 AM

There is a point in that since you don't know how these really should look you can't really judge them on small idiosyncrasies, and hence you get a better impression compared to uncanny valley if it's something common.

However, I also think this is to show that it can create anything, not just copies of stuff it has seen. If you ask for a painting of a woman and it shows you mona lisa, that's not very impressive.

by matsemann

5/21/2025 at 10:15:10 AM

If the concept is unrealistic your mind will be more forgiving to unrealisms. But if it's suppose to be photo-realistic, you'll be hyper-critical.

by kypro

5/21/2025 at 6:19:48 AM

I'm going to go out on a limb and say because it's easiest to take whatever comes out looking interesting and sell it as a vibe ?

Like if you asked a model to help you create a coffeeshop website for a demo, it started looking more like sex shop, you just vibe with it and say that's what you wanted in the first place. I've noticed that the success rate of using AI is proportional to much you can gaslight yourself.

by rafaelmn

5/21/2025 at 2:25:53 AM

I came across some online threads sharing LoRA models the other day - and it seemed that a lot of generative AI users seem to share models that are effectively just highly specialized fixed function filters for existing (generated)images?

The obvious aim of these foundational image/movie generation AI developments is for these to become the primary source of values at cost and quality unparalleled by preexisting human experts, while allowing but not necessitating further modifications by now heavily commoditized and devalued ex-professional editors at downstream to allow for their slow deprecation.

But the opposite seem to be happening: better data are still human generated, generators are increasingly human curated, and are used increasingly closer to the tail end of the pipeline instead of head. Which isn't so threatening nor interesting to me, but I do wonder if that's a safe, let alone expected, outcome for those pushing these developments.

Aren't you welding a nozzle onto open can of worms?

by numpad0

5/20/2025 at 6:39:01 PM

I'm surprised no one has yet to mention the use of the name "Flow", which is also the title of the 2025 Oscar winning animated movie, built using Blender. [1]

This naming seems very confusing, as I originally thought there must be some connection. But I don't think there is.

[1] https://news.ycombinator.com/item?id=43237273

by jader201

5/21/2025 at 2:54:49 PM

For sure seems they are likely deliberately riding on the fame of the movie. I too instantly thought it is some kind of Flow movie animation collaboration similarily like Flow is represented in Blender 4.4 splash screen or is even their mascot.

by maldie

5/20/2025 at 7:25:31 PM

Would Google really stoop so low and try to use the success of the movie to prop their AI video generator tool?

But then again, the do no evil motto is long gone, so I guess anything goes now?

by imp0cat

5/20/2025 at 7:46:35 PM

Note that it's very likely that Veo models are based on "Flow Matching" [1]

[1] https://arxiv.org/abs/2210.02747

by lnyan

5/20/2025 at 7:38:28 PM

It's a common word. There are like 50 things named Flow. It's unrelated.

by Legend2440

5/21/2025 at 12:21:04 AM

Before picking a name it's necessary to Google it and make sure you're not squatting on anything important. It's hard to believe that they didn't find they're about to squat on an Oscar winning animated film less than a year after its release. They decided to roll with it anyway, for a tool that basically aims to eliminate animators and filmmakers

by a2128

5/20/2025 at 9:32:34 PM

For anyone with an access, can you ask it to make a pickup truck drive through mud? I’ve tested various different AIs and they all suck with physics and tires spinning wrong way, it is just embarrassing. Demos look amazing, but when it comes to actual use - there is none that worked for me. I guess it is all to increase “investor value”

by cryptoegorophy

5/20/2025 at 11:09:49 PM

Google posted a video of their own of an off-roader going through mud.

https://www.youtube.com/watch?v=SPF4MGL7K5I

Obviously we don't know how hand picked that is so it would be interesting to see a comparison from someone with access.

by roskelld

5/20/2025 at 10:08:27 PM

I think Google's got something going wrong with their usage limits, they're warning I'm about to hit my video limit after I gave two prompts. I have a Google AI Pro subscription (came free for 1 year with a phone) and I logged into Flow and provided exactly 2 prompts. Flow generated 2 videos per prompt, for a total of 4 videos, each ~8 seconds long. I then went to the gemini.google.com interface, selected the "Veo 2" model, and am now being told "You can generate 2 more videos today".

Since Google seems super cagey about what their exact limits actually are, even for paying customers, it's hard to know if that's an error or not. If it's not an error, if it's intentional, I don't understand how that's at all worth $20 a month. I'm literally trying to use your product Google, why won't you let me?

by lelandbatey

5/21/2025 at 4:15:31 AM

Feel free to test imagen 4 on this benchmark: https://github.com/tianshuo/Impossible-AIGC-Benchmark

Ideogram and gpt4o passes only a few, but not all of them.

by tianshuo

5/20/2025 at 8:58:47 PM

Who is doing all the work of making physical agents that can behave as good as a UBI generator? Something that can not just create videos, but go get groceries(hell grow my food), help a construction worker lay down tiling, help a nurse fetch supplies.

https://www.figure.ai/ does not exist yet, at least not for the masses. Why are Meta and Google just building the next coder and not the next robot?

Its because those problem are at the bottom of the economic ladder. But they have the money for it and it would create so much abundance, it would crash the cost of living and free up human labor to imagine and do things more creatively than whatever Veo 4 can ever do.

by itissid

5/20/2025 at 9:45:47 PM

There are companies working on this, but my understanding is that the training data is more challenging to get because it involves reinforcement learning in physical space.

In the forecast of the AI-2027 guys, robotics come after they've already created superintelligent AI, largely just because it's easier to create the relevant data for thinking than for moving in physical space.

by BosunoB

5/20/2025 at 9:00:33 PM

Welcome to the defining paradox of the 21st century:

https://en.wikipedia.org/wiki/Moravec%27s_paradox

by pj_mukh

5/21/2025 at 3:10:24 AM

I think I have a similar distaste for Google as you, but it's just due to limitations in the (bleeding edge...) technology. There's not like a conspiracy to _not_ make a "UBI generator" - which is surely not possible with current technology and won't be for awhile however hard Google might try.

by throwaway314155

5/20/2025 at 9:02:05 PM

Has anyone gotten access to Imagen 4 for image editing, inpaint/outpaint or using reference images yet? That's core to my workflow and their docs just lead to a google form. I've submitted but it feels like it's a bit of a black hole.

by ericskiff

5/21/2025 at 2:05:56 PM

Why do people making AI image tools keep showing "pixel art" made with it when the tools are so obviously bad at making it? it's such a basic unforced error

by ravenical

5/20/2025 at 8:28:10 PM

The first video is problematic? the owl faces forwards then seamlessly turns around - something is very off there.

The guy in the third video looks like a dressed up Ewan McGregor, anyone else see that?

I guess we can welcome even more quality 5 second clips for Shorts and Instagram

by curvaturearth

5/20/2025 at 5:54:21 PM

The ad for Flow would be much better if they laid off the swirly and wavy effects, and focused on realism.

Soon, you should be able to put in a screenplay and a cast, and get a movie out. Then, "Google Sequels" - generates a sequel for any movie.

by Animats

5/20/2025 at 6:16:29 PM

The swirly effects are probably used to distract from the problems of getting realism right.

by dimal

5/20/2025 at 6:10:34 PM

>Soon, you should be able to put in a screenplay and a cast, and get a movie out.

This "fixes" Hollywood's biggest "issues". No more highly paid actors demanding 50 million to appear in your movie, no more pretentious movie stars causing dramas and controversies, no more workers' unions or strikes, but all gains being funneled directly to shareholders. The VFX industry being turned into a gig meatgrinder was already the canary in the coal mine for this shift.

Most of the major Hollywood productions from the last 10 years have been nothing but creatively bankrupt sequels, prequels, spinoffs and remakes, all rehashed from previous IP anyway, so how much worse than this can AI do, since it's clear they're not interested in creativity anyway? Hell, it might even be an improvement than what they're making today, and at much lower cost to boot. So why wouldn't they adopt it? From the bean counter MBA perspective it makes perfect sense.

by FirmwareBurner

5/20/2025 at 6:38:22 PM

Actors will license their appearance, voice, and mannerisms to these new media projects. (Maybe by established Hollywood studios, maybe not).

Then the first fully non-human (but human-like) actors will be created and gain popularity. The IP of those characters will be more valuable than the humans they replaced. They will be derided by old people as "Mickey Mouse" AI actors. The SAG will be beside themselves. Younger people will not care. The characters will never get old (or they will be perfectly rendered when they need to be old).

The off-screen dramas and controversies are part of the entertainment, and these will be manufactured too. (If there will even be an off-screen...)

This is the future, and we've been preparing for it for years by presenting the most fake versions of ourselves on social media. Viewers have zero expectation of authenticity, so biological status is just one more detail.

It will be perfect, and it will be awful. Kids born five years from now will never know anything different.

by quesera

5/20/2025 at 6:42:13 PM

>Actors will license their appearance, voice, and mannerisms to these new media projects

Very few actors have an appearance or a voice worth a lot in licenses. That's like the top 1% of actors, if that.

I think if done right, humans could also end up getting emotionally attached to 100% AI generated characters, not just famous celebrities.

by FirmwareBurner

5/20/2025 at 6:46:23 PM

Sure, but I'd bet that 1% of actors (of the total pool of SAG on-screen talent membership?) comprise 75%+ of branding/name recognition for consumers.

So the appearance licenses for these 1% are valuable in Stage 1 of the takeover.

The rest are just forgotten collateral damage. Hollywood is full of 'em.

by quesera

5/20/2025 at 6:18:38 PM

> Hollywood's wet dream.

Except it bankrupts Hollywood, they are no longer needed. Of people can generate full movies at home, there is no more Hollywood.

The end game is endless ultra personalized content beamed into people's heads every free waking hour of the day. Hollywood is irrelevant in that future.

by com2kid

5/20/2025 at 6:23:44 PM

Good point, this is indeed a threat to them. Like how many young people are watching streamers now instead of worshiping present day's music, TV or movie star like in the 90's. The likes of Youtube and Twitch could be more valuable than Hollywood.

That's why I think Hollywood is rushing to adopt gen-AI, so they can churn out personalized content faster and cheaper straight to streaming, at the same rate as indie producers.

by FirmwareBurner

5/20/2025 at 6:49:18 PM

I wish I could feed Dan Simmons' books to an AI and watch at my leisure

by pelagicAustral

5/20/2025 at 7:19:07 PM

That's potentially not far in the future. If you can drop a couple of research pdfs and generate a podcast discussion on it, it is even more straightforward to generate a video based on a text. The limit is mostly hardware.

by myth_drannon

5/20/2025 at 8:28:05 PM

Infinite Jest?

by gh0stcat

5/20/2025 at 6:31:00 PM

> Of people can generate full movies at home, there is no more Hollywood.

LLMs have been in the oven for years longer than this, and I'm not seeing any signs of people generating their own novels at home. Well, besides the get-rich-quick grifters spamming the Kindle store with incoherent slop in the hopes they can trick someone into parting with a dollar before they realize they've been had.

by jsheard

5/20/2025 at 7:29:33 PM

Look at view counts for short form videos that are 100% AI generated.

The good "creators" are already making bank, helped by app algorithms matching people up to content they'll find addictive to view.

The content doesn't have to be good it just has to be addictive for 80% of the population.

by com2kid

5/20/2025 at 9:02:26 PM

You're describing the difference between The Godfather and Skibidi Toilet.

by echelon

5/20/2025 at 10:45:27 PM

Do you think platforms like YouTube and tiktok care?

Whatever gets the views.

by com2kid

5/20/2025 at 6:34:32 PM

> I'm not seeing any signs of people generating their own novels at home

Most humans are also not good at writing great scripts/novels either. Just look at the movies that bring in billions of dollars at the box office. Do you think you need a famous novelist to write you a Fast & Furious 11 script?

Sure, there are still great writers that can make scripts that tickle the mind, but that's not what the studios want anymore. They want to push VFX heavy rehashed slop that's cheap to make, easy to digest for the doom-scrolling masses of consumers, and rakes in a lot of money.

You're talking about what makes gourmet Michelin star food but the industry is making money selling McDonals.

by FirmwareBurner

5/20/2025 at 6:23:39 PM

Generating banal stock footage is wildly different than generating a film.

by suddenlybananas

5/20/2025 at 8:24:51 PM

This doesn't necessarily preclude the possibility of making a model that can generate a film. It's still something they can work out. In fact, I wouldn't be surprised if models we're seeing these days are not a necessary first step in that process.

by bilbo0s

5/20/2025 at 8:29:51 PM

I'm not saying it's in principle impossible, but rather I'm saying this doesn't show that it will happen soon.

by suddenlybananas

5/20/2025 at 6:03:47 PM

Definately plausible.

All this is in line with my prediction for the first entirely AI generated film (with Sora or other AI video tools) to win an Oscar being less than 5 years away.

And we're only 5 months in.

https://news.ycombinator.com/item?id=42368951

by colesantiago

5/20/2025 at 8:15:47 PM

> to win an Oscar being less than 5 years away.

You're assuming Oscar voting is primarily driven by film quality but this hasn't been true for a long time (if it ever was). Many academy voters are biased by whatever cultural and political trends are currently ascendant among the narrow subset of Hollywood creatives who belong to the academy (the vast majority of people listed in movie credits will never be academy voters). Due to the widespread impact of Oscar wins in major categories, voters heavily weight meta-factors like "what should the Hollywood community be seen as endorsing?"

No issue in recent memory has been as overwhelmingly central as AI replacing creatives among the Hollywood community. The entire industry is still recovering from the unprecedented strikes which shut down the industry and one of the main issues was the use of AI. The perception of AI use will remain cultural/political poison among the rarified community of academy voters for at least a decade. Of course, studios are businesses and will hire vendors who use AI to cut costs but those vendors will be smart enough to downplay that fact because it's all about perception - not reality. For the next decade "AI" will be to Academy-centric Hollywood what "child labor" is to shoe manufacturing. The most important thing is not that it doesn't happen, it's ensuring there's no clear proof it's happening - especially on any movie designed to be 'major category Oscar-worthy' (such films are specifically designed to check the requisite boxes for consideration from their inception). predict that in the near-term AI in the Oscars will be limited to, at most, a few categories awarded in the separate Technical Oscars (which aren't broadcast on TV or covered by the mainstream media).

by mrandish

5/20/2025 at 7:17:21 PM

Oscar in what category?

We are about six years into transformer models. By now we can get transformers to write coherent short stories, and you can get to novel lengths with very careful iterative prompting (e.g. let the AI generate an outline, then chapter summaries, consistency notes, world building, then generate the chapters). But to get anything approaching a good story you still need a lot of manual intervention at all steps of the process. LLMs go off the rail, get pacing completely wrong and demonstrate gaping holes in their understanding of the real world. Progress on new models is mostly focused in other directions, with better storytelling a byproduct. I doubt we get to "best screenplay" level of writing in five years.

Best Actor/Actress/Director/etc are obviously out for an AI production since those roles simply do not exist.

Similar with Best Visual Effects, I doubt AI generated films qualify.

That leaves us with categories that rate the whole movie (Best Picture, Best International Feature Film etc), sound-related categories (Best Original Score, Original Song, Sound) and maybe Best Cinematography. I doubt the first category is in reach. Video Generation will be good enough in five years. But editing? Screenwriting? Sound Design?

My bet would be on the first AI-related Oscar to be for an AI generated original score or original song, and that no other AI wins Oscars within five years.

Unless we go by a much wider definition of "entirely AI generated" that would allow significant human intervention and supervision. But the more humans are involved the less it has any claim to being "entirely AI". Most AI-generated trailers or the Balenciaga-Potter-style videos still require a lot of human work

by wongarsu

5/21/2025 at 3:03:37 AM

I just think the entire framing is wrong.

I have done quite a bit with AI generated audio/sound/music.

At some point in the process, the end result feels like your own and the models were used to create material for the end work.

At some point, using AI in the creative process will be such a given that it is left unsaid.

I would assume the screen play next year that wins the Oscar will have been helped with the aid of a language model. I can't imagine a writer not using a language model to riff on ideas. The delusional idea here is the prompt "write an Oscar winning screenplay" and that somehow that is all there is going to the creative process.

by rxtexit

5/20/2025 at 6:32:59 PM

I would believe that an AI generated film will never win an Oscar.

I bet they will soon add rules that AI movies can't even compete on it.

by zanellato19

5/20/2025 at 6:39:38 PM

Separate categories

by zombiwoof

5/20/2025 at 6:29:27 PM

AI trailers already exist: https://www.youtube.com/playlist?list=PL_52fVxPZcIiEvGocuVn6...

by esafak

5/20/2025 at 6:53:06 PM

I feel like we should probably draw a distinction between "AI trailers exist as a replacement for traditional trailers" and "AI trailers exist because they're the clickbait du-jour for cynical social media engagement farmers". For now they're 100% the latter.

by jsheard

5/21/2025 at 2:08:21 PM

I'm excited about this.

Think of all of your favorite novels that are deemed "impossible" to adapt to the screen.

Or think of all the brilliant ideas for films that are destined to die in the minds of people who will never, ever have the luck or connections required to make it to Hollywood.

When this stuff truly matures and gets commoditized I think we are going to see an explosion of some of the most mind blowing art.

by skc

5/21/2025 at 2:18:52 PM

It's already difficult enough to make a successful book adaptation, even WITH authorial intent. Can't imagine that hours of patchwork AI-generated video, with all its artifacting and consistency errors, will fare any better than "The Rings of Power".

by flmontpetit

5/21/2025 at 2:43:36 PM

I think not yet, but it is coming.

I can see it using some form of PEFT so that the output becomes consistent with both the setting and the characters and then it is about generating over and over each short segment until you are happy with the outcome. Then you stitch them together and if you don't like some part you can always try to re-generate them, change the prompt, ...

by marcyb5st

5/22/2025 at 2:39:07 PM

I don't believe we will live to see the day where these models can replace a competent production team. At best they'll be what LLMs are to creative writing, which has so far only conclusively replaced low effort blogspam and fraud/plagiarism.

by flmontpetit

5/20/2025 at 8:06:32 PM

I think it's a good thing to have more people creating things. I also think it's a good thing to have to do some work and some thinking and planning to produce a work.

by brm

5/20/2025 at 5:53:43 PM

I don't care about AI animals but the old salt offended me.

by IncreasePosts

5/21/2025 at 10:46:41 AM

tbh, wasnt that impressed maybe its cause social media has been heavily marketing out all these things in bulkkk and moreover, at this point, it just feels one company copying what the other released, even the names feel not original?

by onlyreal_1

5/21/2025 at 12:53:23 PM

I generally think that Kling or even Runway has achieved the visual fidelity of Veo (flaws and all, physics problems and direction of action and such), but now people are basically experiencing sensory bias where they think that some things about the visuals make better sense because nw it has sound as an added context. Visually, yeah. Probably on par with Kling, possibly worse on the depction of dynamic action

by horhay

5/20/2025 at 6:16:13 PM

How do you use Imagen 4 in Gemini? I don't see it in the model picker, I just 2.5 Flash and 2.5 Pro (Upgrade).

by sergiotapia

5/20/2025 at 6:25:44 PM

It's not at all obvious from Gemini - probably the easiest way is through Whisk.

https://labs.google/fx/tools/whisk

by vunderba

5/20/2025 at 11:22:55 PM

Whisk is only available in certain countries though, unlike Gemini.

by cubefox

5/20/2025 at 6:38:43 PM

Have they reveled anything similar to Claude Code yet? I sure hope they are saving that for I/O next month... this video/photo reveals are too gimmicky for my liking, alas I'm probably biased because I don't really have a use for them.

by pelagicAustral

5/20/2025 at 6:40:39 PM

https://jules.google/ posted here today https://news.ycombinator.com/item?id=44034918

by dmd

5/20/2025 at 6:46:09 PM

Yeah, I saw that... not quite the same... I used it for a bit but it's more like an agent that clings to a Github repo and deals with tickets up there, can't really test live on local, it just serves a different purpose.

by pelagicAustral

5/20/2025 at 6:56:13 PM

Google I/O is happening right now. This is one of the announcements, I believe.

by lxgr

5/20/2025 at 6:37:52 PM

On a technical level, this is a great achievement.

On a more societal level, I'm not sure continuously diminishing costs for producing AI slop is a net benefit to humanity.

I think this whole thing parallels some of the social media pros and cons. We gained the chance to reconnect with long lost friends—from whom we probably drifted apart for real reasons, consciously or not—at the cost of letting the general level of discourse to tank to its current state thanks to engagement-maximizing algorithms.

by airstrike

5/20/2025 at 6:48:46 PM

Future is not bright. While we are endlessly talking about details reality is that AI is taken over so many jobs.

Not in 10 years but now.

People who just see this as terrible are wrong. AI improving curves is exponential.

People adaptability is at best linear.

This makes me really sad. For creativity. For people.

by sebau

5/20/2025 at 6:55:04 PM

Maybe. The internet was also exponential, and while it has its drawbacks, I think it's resulted in a huge increase in creativity. The world looks very different than it did 30 years ago, and I think mostly for the better.

by mindvirus

5/20/2025 at 6:55:05 PM

> Future is not bright. While we are endlessly talking about details reality is that AI is taken over so many jobs.

Of course this is not because of AI. It's because of the ridiculous system of social organization where increased automation and efficiency makes people worse off.

by jampekka

5/22/2025 at 2:46:53 PM

Time for the Butlerian Jihad

by elzbardico

5/20/2025 at 7:09:54 PM

What’s the easiest way to try out Imagen 4?

Edit: https://labs.google/fx/tools/whisk

by skybrian

5/21/2025 at 3:42:26 AM

I have some base knowledge about diffusion/dit, I am so curious about how this can be done. Do you know some resources in this field? THANKS!

by celespider

5/21/2025 at 11:01:32 AM

Stability is conspicuously absent from the imagen benchmarks. I assume that means it's significantly better

by nprateem

5/20/2025 at 6:46:24 PM

what do they use to train these models? youtube videos?

by pier25

5/20/2025 at 7:41:30 PM

Wow, the audio integrations really makes a huge difference, especially given it does both sounds and voices

Can’t wait to see what people start making with these

by nico

5/20/2025 at 7:10:44 PM

How does this compare with sora (pro)?

by flakiness

5/20/2025 at 9:04:32 PM

Sora, the video model, is shit. Kling, Runway, and a whole host of other models are better. You don't have to do much to be better than Sora.

Sora, the image model (gpt-image-1), is phenomenal and is the best-in-class.

I can't wait to see where the new Imagen and Veo stack up.

by echelon

5/21/2025 at 1:43:27 AM

Well all this is great from a technology point of view. But what about millions of jobs in the film industry in animation, motion artists etc? Why is it feeling like few humans are making sure others stop eating and living a good life?

by methuselah_in

5/21/2025 at 1:53:22 AM

Do you feel the same way about all the human computers that computers put out of work?

by IncreasePosts

5/21/2025 at 2:12:43 AM

what exactly do you do that AI won't take it over? or are you one of those "AI artists"? you do know their end goal would be to replace you as the "prompter" with AI and have auto generated content for everyone?

by emkoemko

5/21/2025 at 3:56:37 PM

I'm a computer programmer and I'm not really worried about it. If it can take my job, then it can take a while slew of other jobs too, and society is in for a big upheaval.

by IncreasePosts

5/21/2025 at 1:50:41 AM

https://en.wikipedia.org/wiki/Luddite

by danabenson

5/21/2025 at 8:12:31 AM

You are probably meaning that as a slur.

In reality Luddites did not oppose technology per-se, but the dramatic worsening of the working conditions in the factories, reduced wages and concentration of the income to the capital holders. These are the same problems that should be addressed contemporarily.

They initially tried to address these by political means. But with that failing they moved to sabotage and violence.

https://www.smithsonianmag.com/innovation/when-robots-take-j...

by jampekka

5/21/2025 at 2:11:43 AM

This is coming for everyone's jobs. It'd be possibly an OK or good thing after some adaptation if I didn't suspect that the people with power during this transition were nihilists or people who's mission in life is to be relatively rather than absolutely well off. If everyone can have what they need they will not feel important enough

by sidibe

5/21/2025 at 1:47:22 AM

[dead]

by codezero

5/20/2025 at 7:07:50 PM

When can I change the camera view and have everything stay consistent?

by ugh123

5/20/2025 at 7:22:08 PM

Thanks to them, we will be able to enter new era of politics. Where nothing is true, and everything is vibe based.

Thank you, researchers, for making our world worse. Thank you for helping to kill democracy.

by StefanBatory

5/20/2025 at 6:35:47 PM

I'm surprised at how bad these are

by bowsamic

5/20/2025 at 11:44:29 PM

All my Veo 3 videos has sound missing. No idea why. Seems like a common problem.

by kumarm

5/20/2025 at 6:58:59 PM

Well, all the AI labs wanted to "Feel the AGI" and the smoke from Google...

They all got smoked by Google with what they just announced.

by rvz

5/21/2025 at 2:41:26 PM

Google's been coooooking

by clarkcharlie03

5/21/2025 at 2:12:53 PM

Well this is terrifying

by impalallama

5/20/2025 at 7:32:50 PM

is it still a waitlist?

by htrp

5/20/2025 at 7:18:48 PM

Ehh, really for 20$. Break dancers with no music, people just pop in and out ?

Google what is this?

How would anyone use this for a commercial application.

by 999900000999

5/20/2025 at 7:25:36 PM

"The Bloomberg terminal for creatives"

by matthewaveryusa

5/21/2025 at 2:52:46 AM

Google hit the jackpot with their acquisition of YouTube and it's now paying dividend. YouTube is the largest single source of data and traffic on the Internet, and it's still growing fast. I think this data will prove incredibly important to robotics as well. It's a shame they sold Boston Dynamics in one of their dumbest ever moves because of bad PR.

by _ncuy

5/21/2025 at 3:55:31 AM

"Growing fast" is questionable these days.

There is an ever growing percentage of new AI-generated videos among every set of daily uploads.

How long until more than half of uploads in a day are AI-generated?

by brunoborges

5/21/2025 at 4:46:11 AM

Even if the content was 100% AI generated (which is the furthest thing from reality today) human engagement with the content is a powerful signal that can be used by AI to learn. It would be like RLHF with free human annotation at scale.

by codelord

5/21/2025 at 9:33:15 AM

Won't the human engagement be replaced by AI engagement too? if it isn't already being replaced?

by dom96

5/21/2025 at 9:46:04 AM

The AI is not paying for watching videos yet

by thomashop

5/21/2025 at 10:27:55 AM

Indeed, it's the advertisers who are paying for AI to watch videos....

by DrScientist

5/21/2025 at 10:48:07 AM

And paying for my sofa to watch a unskippable 50s ad while I make a coffee.

by ben_w

5/21/2025 at 12:48:00 PM

Back in the day when everyone used to watch broadcast TV, and stations synchronised their add breaks, water consumption would spike with every add break.

by mr_toad

5/21/2025 at 3:21:10 PM

The UK has a unique problem with demand spikes for electricity during commercial breaks, due to the British penchant for using high-power electric kettles to make tea. In the worst case, demand could rise and fall by gigawatts within a matter of minutes.

https://en.wikipedia.org/wiki/TV_pickup

https://www.youtube.com/watch?v=slDAvewWfrA

by jdietrich

5/21/2025 at 3:11:03 PM

Google already invests a tremendous amount of resources into identifying and preventing fraudulent ad impressions -- I don't see that changing much until AI is so cheap that it makes sense to run a full agent for pennies per hour. Sadly.

by bbor

5/22/2025 at 11:12:41 AM

Not talking about fraud per se - in the sense of trying to drive revenue for a particular video channel - just that if you wanted to train AI on youtube videos you are in effect getting the advertisers to pay for the serving of them.

Perhaps the difference here is the behaviour would be much more human and thus harder to detect using current fraud detection?

by DrScientist

5/21/2025 at 2:54:25 PM

Yes it will. Soon humans will be the minority on the internet. I wrote some guesses about this 2 years ago: https://art.cx/blog/12-08-22-city-of-bots

by artursapek

5/21/2025 at 7:08:08 AM

And google is in the best possible position to detect it if they want to exclude it from their datasets.

by franga2000

5/21/2025 at 7:10:25 AM

They're never going to manage to do that, just on a technical level

Plus some users might want to legitimately upload things with AI-generated content in it

by sebstefan

5/21/2025 at 9:54:45 AM

I'm pretty sure YouTube saves the metadata from all the video files uploaded to it. It seems pretty trivial to exclude videos uploaded without camera model or device setting information. I seriously doubt even a tiny fraction of people uploading AI content to YouTube are taking the time to futz about with the XMP data before they upload it. Sure, they'll miss out on a lot of edited videos doing that, but that's probably for the best if you're trying to create a data set that's maintaining fidelity to the real world. Lots of ways to create false images without AI

by derektank

5/21/2025 at 10:39:17 AM

"Since launching in 2023, SynthID has watermarked over 10 billion images, videos, audio files and texts, helping identify them as AI-generated and reduce the chances of misinformation and misattribution. Outputs generated by Veo 3, Imagen 4 and Lyria 2 will continue to have SynthID watermarks.

Today, we’re launching SynthID Detector, a verification portal to help people identify AI-generated content. Upload a piece of content and the SynthID Detector will identify if either the entire file or just a part of it has SynthID in it.

With all our generative AI models, we aim to unleash human creativity and enable artists and creators to bring their ideas to life faster and more easily than ever before."

From the page linked in the post....

So there's different ways to detect AI generated content (videos/images atleast). (https://www.nature.com/articles/s41586-024-08025-4 <-- paper on synthID / watermarking and detecting it with LLMs)

by sim7c00

5/21/2025 at 2:41:53 PM

I somewhat doubt that YT cares much about AI content being uploaded, as long as it’s clearly marked as such.

What they do care about is their training set getting tainted, so I imagine they will push quite hard to have some mechanism to detect AI; it’s useful to them even if users don’t act on it.

by theptip

5/21/2025 at 7:19:39 AM

> They're never going to manage to do that, just on a technical level

Why not? Given enough data, it's possible to train models to differentiate - especially since humans can pick up on the difference pretty well.

> Plus some users might want to legitimately upload things with AI-generated content in it

Excluding videos from training datasets doesn't mean excluding them from Youtube.

by Timon3

5/21/2025 at 8:14:29 AM

I agree, especially because in practice the vast majority of AI-generated videos uploaded to YouTube are going to be from one of about 3 or 4 generators (Sora, Veo, etc.). May change in the future, but at the moment the detection problem is pretty well constrained.

by _delirium

5/22/2025 at 12:21:57 PM

> Excluding videos from training datasets doesn't mean excluding them from Youtube.

Ah then sure. It was this part that was problematic.

If users are still allowed to upload flagged content, then false positives almost don't matter, so Youtube could just roll out some imperfect solution and it would be fine

by sebstefan

5/21/2025 at 2:49:30 PM

[dead]

by throawayonthe

5/21/2025 at 4:00:38 AM

In the future, a new intelligent species will roam the earth, they will ask, "why did their civilization fall?" The answer? These homo-sapiens strip mined the Earth and exacerbated climate change to generate enough power to make amusing cat videos...

by bamboozled

5/21/2025 at 6:41:50 AM

It’s the much-feared the paper clip apocalypse, but we did it to ourselves with cat clips.

by jonplackett

5/21/2025 at 12:13:13 PM

At this point heat death through cat videos sound more appealing than nuclear apocalypse, lol

by Cthulhu_

5/21/2025 at 4:31:47 AM

And those videos were either not watched by anyone human or not truly watched by being part of an endless feed of similar slop.

by Duralias

5/21/2025 at 8:56:25 AM

how do you truly watch an ai-generated cat video

by akho

5/21/2025 at 10:33:56 AM

use your eyes. write a detailed and elaborate review on your blog of the cat and his antics. seems easy enough?

by sim7c00

5/21/2025 at 2:57:34 PM

We don't have an energy problem on earth. We have a capitalism problem.

Renewable energy is easily able to provide enough energy sustainable. Batteries can be recycled. Solar panels are glas/plastic and silicium.

Nuclear is feasable, fusion will happen in 50 years one way or the other.

Existens is what it is. If it means being able to watch cat videos, so be it. We are not watching them for nothing, we watch them for happiness.

by Flamentono2

5/21/2025 at 9:42:20 PM

Existens is what it is. If it means being able to watch cat videos, so be it. We are not watching them for nothing, we watch them for happiness.

Well that's just your opinion.

Yes we can generate electricity, but it would be nice if used it wisely.

by bamboozled

5/23/2025 at 8:30:57 AM

Of course its my opinion, its my comment after all.

Nonetheless, survival can't be the life goal after all the moon will drift away from earth in the future, the sun will explode and if we survive that as a species, all bonds between elements will disolve.

It also can't be about giving your dna away because your dna has very little to no impact over just a handful of generations.

And no the goal of our society has to be to have as much energy available as possible to us. So much energy, that energy doesn't matter. There is enough ways of generating energy without a real issue at all. Fusion, renewable energy directly from the sun.

There is also no inherant issue right now preventing us all having clean stable energy besides capitalsm. We have the technology, we have the resources, we have the manufacturing capacity.

To finish my comment: Its not about energy, its about entropy. You need energy to create entropy. We don't even consume the energy of the sun, we use it for entropy and dissipate it back to space after.

by Flamentono2

5/21/2025 at 1:17:15 PM

I hear BD aren't making much money anyway so I wonder if they couldn't just buy them back for not much loss overall.

by qoez

5/21/2025 at 4:59:03 AM

On the other hand, take one look at the way they caption a video in their dataset, and you have seen like 90% of the "secret sauce" of generative art. All this supposed data and knowledge, and anyone who has worked 1 day on Imagen or Veo could become a serious competitor.

The remaining 10% is the solution to generating good hands, of course. And do you think YouTube has been helping anyone achieve that?

by doctorpangloss

5/21/2025 at 8:37:11 AM

Why videos are important for robotics?

by informal007

5/21/2025 at 10:57:13 AM

If you can generate realistic video stream, responding to player movements and interactions, you can train your robot using that video stream. It's much more scalable, compared to building physical environments and performing real-world training.

Of course the alternative is to use game engines, but it's possible that AI would generate more realistic video stream for the same money spent. Those recent AI-generated videos certainly look much more realistic than any game footage I ever saw.

by vbezhenar

5/21/2025 at 12:57:59 PM

Game engines require a lot of additional work to make them suitable for that task, too— deep integration for sensor data, inputting maps and assets, plus the basic mismatch that these workflows are centered around Windows gui tools whereas robotics is happening on the Linux command line.

by mikepurvis

5/21/2025 at 10:34:17 AM

object detection i'd guess.

by sim7c00

5/21/2025 at 7:46:00 AM

Why should YouTube be here at the advantage? Every competitor also has access to these videos(?)

by mrklol

5/21/2025 at 8:02:56 AM

Easy access to the videos without having to download them from Google (and without Google trying to stop you from scraping them, which they will) is an enormous advantage. There's way, way too much on Youtube for to index and use over the internet, and especially not at full resolution.

by aloha2436

5/21/2025 at 3:08:41 PM

That is the other perk, Google has all those videos stored in original quality locally.

It wouldn't be hard for google to poison competitor training just by throttling bandwidth.

by Workaccount2

5/21/2025 at 1:18:50 PM

Google is making money hosting these videos, and users are freely uploading them. A competitor would have to scrape/download them, store them, process them all at their own cost, along with having much less metadata available (Which videos are most viewed, which segments, what do people repeat, what do people skip, what do people watch after this video, which video generates the most ad revenue, etc.)

by jclardy

5/21/2025 at 2:59:44 PM

> Google is making money hosting these videos

This isn't certain. Google do not break out Youtube revenues nor costs. Hosting this amount of videos, globally, redundantly, the vast majority of which are basically never watched, cannot be cheap.

It's entirely plausible that Google's wider benefit from Youtube (such as training video generation algorithms and better behaviour tracking for better targeted ads across the internet) are enough to compensate for Youtube in particular losing money.

by sofixa

5/21/2025 at 3:43:58 PM

> Google do not break out Youtube revenues nor costs.

Google does break out Youtube revenue.

Latest 10-K: https://abc.xyz/assets/77/51/9841ad5c4fbe85b4440c47a4df8d/go...

See page 10, for youtube Ads revenue.

by 1024core

5/22/2025 at 4:00:51 AM

My bad, I thought it's the two. But they don't break out costs, so in reality we don't know if YouTube is profitable or not.

by sofixa

5/21/2025 at 3:48:47 PM

Videos without metadata is not as useful. Google also has details on which videos are watched where. Which parts do people skip. All the videos that are blocked for various reasons. The performance of videos with humans over time and so on. They can focus on videos with signals that indicate that humans prefer those videos or clips.

by bongoman42

5/21/2025 at 8:02:12 AM

Do they, though? Are competitors actually downloading all these videos? Supposedly there are 5 billion videos on YouTube (https://seo.ai/blog/how-many-videos-are-on-youtube), downloading all of that is a LOOOOT of data and time.

I mean, you could limit yourself to the most popular or most interesting 100 million, but that's still an enormous amount of data to download.

by akie

5/21/2025 at 8:06:40 AM

Just wanted to mention the latter, you don’t need all videos. It’s indeed a lot of data but doable so I am not sure if I would count this as big advantage.

by mrklol

5/21/2025 at 1:44:50 PM

You are incredibly naive if you don’t see full, unrestricted access to YT as an advantage.

by qudat

5/21/2025 at 10:46:44 AM

presumed datasets: 1. its petabytes of data in the public/listed/free tier videos. 2. there's paywalled videos. 3. there's private/unlisted videos.

google will have access to all of these. competitors will have to do tons of network interactions with google to pull in only the first set. (which google could detect and block depending on how these competitors go about it)

by sim7c00

5/21/2025 at 8:59:31 AM

Most youtube videos use stock video photography. Or the face of some youtuber.

If we look at the Veo 3 examples, this is not the typical youtube video, but instead they seem to recreate cgi movies, or actual movies.

by seydor

5/20/2025 at 6:02:26 PM

Of course they had to name a film making proprietary tool with the name of an award winning film made using open-source tools released less than a year ago...

by phh

5/20/2025 at 6:13:45 PM

"Flow" is one of the most generic names in tech. I can think of 10+ products called that off the top of my head.

by paxys

5/20/2025 at 6:45:46 PM

There's no way they named their AI filmmaking tool after the last winner of the Academy Award for Best Animated Feature by accident.

by debugnik

5/20/2025 at 6:38:10 PM

I still remember a style transfer paper which proudly mimicked a popular artist who had passed away barely a few years before (Qinni). Many AI researchers seemingly want to wear the skins of the people they rip off.

by debugnik

5/20/2025 at 7:02:37 PM

Seems pretty obvious that they named it after Facebook's JS type checker from 2015

by woah

5/20/2025 at 11:49:41 PM

[dead]

by merillecuz56

5/20/2025 at 11:34:33 PM

[dead]

by fefawfefafds

5/20/2025 at 11:38:53 PM

[dead]

by fdaffeafe

5/20/2025 at 6:09:44 PM

Like most AI image or video generation tools, they produce results that look good at first glance, but the more you watch, the more flaws and sloppiness you notice, and they really lack storytelling

by quantumHazer

5/20/2025 at 6:23:19 PM

They don't have to be as good as the best film production team - they just need to be better than the average/B-grade ones to gain adoption.

With the media & entertainment hungry world which is about to get worse with the unempoyed/underemployed tiktok generation needing "content", something like this has to have a play.

by harikb

5/20/2025 at 6:26:16 PM

Or replace cutscenes in video games, or short videos on ads on mobile (already small and people are barely paying attention)

Nowadays when I randomly open a news website to read some article, at the bottom of the page all the generic "hack to lose your belly" or "doctors recommend weird japanese device" or "how seniors can fly business class", I've been noticing lately 1/3rd of the images seem to be AI generated...

by inerte

5/20/2025 at 6:28:18 PM

God, what a dismal future we're building.

by nathan_compton

5/20/2025 at 6:33:31 PM

Warning: browsing the Web without an ad blocker is hazardous to your mental health. If you regularly see ads permeating most web pages, and don't know how to avoid that, you may need to see a specialist.

by nine_k

5/20/2025 at 6:40:47 PM

Perhaps some of us choose not to rip off payments that are due to the people who provide a service that we're using. If I visit a website that's so infested with ads that I don't like being there, it's on me to stop visiting that website.

I simply don't think it's fair to cheat service providers when we don't like their service. You have a choice, and that choice is to not use that service at all. They're providing it under the terms that it is ad-supported. If you don't want to support it, but you still want to use it, then you're cheating someone. That is dishonest and unethical.

by AStonesThrow

5/20/2025 at 7:28:45 PM

If ad-supported sites would return an optional response header indicating such, and their opinion on adblocking visitors. E.g.:

  Advertisement-Permission: [required|requested]

And my adblockers had a config option to abort pageloads with an appropriate error message, if `required` or `requested`, then I would use it happily.

In the meantime, I'm browsing every site with all content blockers set at maximum, because any other choice is incomprehensible on the modern web.

If I consequently visit some sites that want me to consume advertising of which I am unaware, then that is entirely their issue, not mine.

by quesera

5/20/2025 at 6:41:28 PM

Jon Stewart did a gag in this week's episode where there were happy meal toy versions of a bunch of congressmen on screen for a few seconds.

A lot of content is like this - you just need an approximation to sell an idea, not a perfect reproduction. Makes way more sense to have AI generate you a quick image for a sight gag than to have someone spend all day trying to comp it by hand. And as AI imagery gets more exposure in these sort of scenarios, more people will be accustomed to it, and they'll be more forgiving of its faults.

The bar for "good enough" is gonna get a lot lower as the cost of producing it comes way down with AI.

by bsimpson

5/20/2025 at 6:30:15 PM

But you don't have to outsource 100% of your creative work to your tools. This is a toolbox, not a complete automatic masterpiece generator. If you want serious production, don't remove yourself from the loop.

Drive the storytelling, consult with AI on improving things and exploring variations.

Generate visuals, then adjust / edit / postprocess them to your liking. Feed the machine your drawings and specific graphic ideas, not just vague words.

Use generated voices where they work well, record real humans where you need specific performance. Blend these approaches by altering the voice in a recording.

All these tools just allow you to produce things faster, or produce things at all such that would be too costly to shoot in real life.

by nine_k

5/20/2025 at 6:51:49 PM

AI used to be quite bad at coding just - what - 2 years ago?

Now it's "good enough" for a lot of cases (and the pace of improvement is astounding).

AI is still not great at image gen and video gen, but the pace of improvement is impressive.

I'm skeptical image, video, and sound gen are "too difficult" for AI to get "good enough" at for many use cases within the next 5 years.

by onlyrealcuzzo

5/23/2025 at 9:46:30 AM

it's still quite bad though

by quantumHazer

5/20/2025 at 6:30:49 PM

I don’t get how someone can look at these videos and think “wow there’s lots of flaws and it’s sloppy and no storytelling” rather than “holy smokes this stuff is improving fast!”

In 2 years we have moved from AI video being mostly a pipe dream to some incredible clips! It’s not what this is like now, but what will it be like in 10 years!

by Closi

5/21/2025 at 12:57:31 PM

In the era of more software advancements, people have perceived the progress of tech to be incremental more than the opposite. I think even tech bigwigs of the past have said extrapolation is a futile endeavor because technology is unpredictable.

by horhay

5/21/2025 at 5:56:47 PM

Bill Gates saw computers and extrapolated that there would be one in every home, Gordon Moore produced Moores Law which literally extrapolated compute in the 60s and almost still holding, and you can extrapolate that GPT-5 will perform better than GPT-4, and that there will probably be a GPT-6 that will be even better than that.

Extrapolating that technology will get better in the future when it has got better in the past isn’t a sure bet, but it’s a reasonably reliable one.

by Closi

5/23/2025 at 9:49:47 AM

No, past performances are not an indicator of future performance. Sometimes it happens to be, but many other times it does not.

by quantumHazer

5/23/2025 at 10:20:29 AM

I think this is a strange take - Technological advancements aren't mutual funds.

Of course it's an indicator of future performance - Not a guarantee, but certainly a indictator.

by Closi

5/20/2025 at 6:11:24 PM

You don’t even have to look close for some of these. The owl suddenly flipping direction in the first video was jarring

by superb_dev

5/20/2025 at 6:25:39 PM

I think that's just the silhouette illusion[1]. In this case likely abetted by the framing elements moving near the edges.

[1] - https://en.wikipedia.org/wiki/Spinning_dancer

by llm_nerd

5/20/2025 at 6:25:56 PM

It doesn't flip, it's an illusion. The owl is always facing the camera.

by Workaccount2

5/20/2025 at 6:20:30 PM

When it's in silhouette you don't know what direction it is facing, technically. I think what's happening is when you see a shot of something flying in front of something prominent (in this case, the moon), your brain naturally perceives it is going away from the camera and toward the object.

by billyp-rva

5/20/2025 at 6:12:25 PM

Yeah the owl ""animation"" is terrible, I bet they could have found better examples? If it wasn't the case I don't know what to think

by quantumHazer

5/20/2025 at 6:19:20 PM

It looked to me like the owl was turning around to land.

by JamesBarney

5/20/2025 at 6:25:24 PM

Yea, but we're early days and I think that will go away as the tools get better. Also - did you watch the sample short films they have?

by jhaile

5/20/2025 at 6:58:06 PM

You should see the terrible results it's possible to generate with AfterEffects, Blender, Houdini, etc..

by spiderice

5/20/2025 at 7:05:31 PM

[flagged]

by seydor

5/20/2025 at 6:06:19 PM

[flagged]

by jonahx

5/20/2025 at 11:31:58 PM

[flagged]

by fasdfdsa

5/20/2025 at 7:19:43 PM

I do find myself wondering if the people working on this stuff ever give any real thought to the impact on society that this is going to have.

I mean obviously the answer is "no" and this is going to get a bunch of replies saying that inventors are not to blame but the negative results of a technology like this are fairly obvious.

We had a movie two years ago about a blubbering scientist who blatantly ignored that to the detriment of his own mental health.

by lenerdenator

5/20/2025 at 7:21:31 PM

It's really being forced on us too. Jira, Confluence, and Notion are three products I've used where they've purposefully ignored requests to allow us to disable or hide the bundled generative AI. It's really intrusive. I also switched to Duck Duck Go because of the new AI on Google

by bowsamic

5/20/2025 at 7:21:21 PM

How could you possibly push back on the societal benefit of a director being able buy a vacation home in Lake Tahoe?

by tmpz22

5/20/2025 at 10:58:43 PM

And what about the rest of human pyramid working under the director employed in these productions?

by briankelly

5/20/2025 at 10:10:35 PM

Remember when they fired Timnit Gebru for publishing on AI safety?

by tootie

5/21/2025 at 12:23:33 AM

Quite a narrow view to interpret what happened there as firing Gebru for publishing on AI safety. Google still conducts and publishes research on AI safety, just without Gebru who helpfully offered to resign if Google didn't name her critics.

by themacguffinman

5/21/2025 at 1:59:28 AM

Ethics, not safety. AI safety as a term became a big focus after that, largely pushed by big AI vendors, and largely as a way of refocusing attention away from the issues being raised by people working in AI ethics.

by dragonwriter

5/20/2025 at 11:25:20 PM

> Flow is not available in your country yet.

A bit depressing.

by Lucasoato

5/20/2025 at 11:05:51 PM

Love flow tv ! Absolutely blown away by the improvements on these models, and also the channel interface was not bad and quite smooth.

I cant be the only one wondering where the swedish beach volleyball channel is though.

by ionwake

5/20/2025 at 7:29:07 PM

This doesn't look (any?) better than what was shown a year or two ago for the initial Sora release.

I imagine video is a far tougher thing to model, but it's kind of weird how all these models are incapable of not looking like AI generated content. They all are smooth and shiny and robotic, year after year its the same. If anything, the earlier generators like that horrifying "Will Smith eating spaghetti" generation from back like three years ago looks LESS robotic than any of the recent floaty clips that are generated now.

I'm sure it will get better, whatever, but unlike the goal of LLMs for code/writing where the primary concern is how correct the output is, video won't be accepted as easily without it NOT looking like AI.

I am starting to wonder if thats even possible since these are effectively making composite guesses based on training data and the outputs do ultimately look similar to those "Here is what the average American's face looks like, based on 1000 people's faces super-imposed onto each other" that used to show up on Reddit all the time. Uncanny, soft, and not particularly interesting.

by crat3r

5/20/2025 at 7:32:48 PM

It has long been established that Veo has a waaay better understanding of physics, and consistency over multiple frames, than Sora. Not even close.

by ahmedfromtunis

5/20/2025 at 7:36:20 PM

I want to be clear, I don't think Sora looks better. What I am saying is they both look AI generated to a fault, something I would have thought would be not as prominent at this point.

I don't follow the video generation stuff, so the last time I saw AI video it was the initial Sora release, and I just went back to that press release and I still maintain that this does not seem like the type of leap I would have expected.

We see pretty massive upgrades every release between all the major LLM models for code/reasoning, but I was kind of shocked to see that the video output seems stuck in late 2023/early 2024 which was impressive then but a lot less impressive a year out I guess.

by crat3r