AI Video has arrived - It looks incredible

sfoster

Staff Member
Moderator
Click this link to watch some of the videos

Really incredible stuff from OpenAI
There was also a wired article, but the article was so terribly formatted with ads I'm not posting it

Instead here are the different videos from the wired article with a direct link.





 
Re. the AI, I join my fellow Luddites, mlessman and Celtic Rambler. It doesn't bother me. I don't need it. I don't want it.

Although, for people who actually construct films, and probably especially for animators, I can kind of get why it could be troublesome. But for writers, I just can't believe it can ever be any competition.

What the AI will almost certainly discover (fine, certainly discover; fine: 'dude, already there'), are the algorithmic steps designed to tell a story, or, more importantly, to create an emotional experience in a viewer/reader--a genetic core formula extracted from billions of examples, with blanks to be filled in with whatever: a sentient dog, a paranormal chosen-one teen, with whatever.

But writing that is too much like other writing is by definition bad, even if it is technically correct. And this is all the robot can do.
I wholeheartedly agree that LLM technology isn't going to replace writers.

I view this Sora/Video tech as Cinematographer.
A highly skilled, professional quality cinematographer/set designer that works for pennies on the dollar.

It's not everything, it's not the entire crew, it's not the writer, etc... it's the cameraman/DOP.
It's very normal to delegate some control to your DOP.
 
This is sort of what I meant when I remarked previously that Sora possibly has the potential to be a useful storyboarding tool, and could well become capable of producing a "complete" movie ... in very rough form.

However, from what I've seen of ChatGPT and other machine-learning/pseudoAI algorithmically driven software, there's a fundamental problem with consistency: ask the same question three times and you'll get three different answers (none of which might be appropriate responses). As we saw with the first Sora examples, the generated video was only tenuously faithful to the prompt text. Another recent video commentary on the topic I watched (here) sees the same thing:
Right now, the platform can only create clips that are up to a minute long, because the AI model won't respond to similar prompts in the exact same way. You couldn't combine 60 different one minute AI clips into a coherent movie. So all of these models tend to what they call hallucinate. They sort of go off the beaten path. So the longer the video is, the more likely it is to fall apart.

Now while that's something that could well be improved upon as the software evolves, you're still stuck with an algorithm that will interpret your text in a way that it thinks is reasonable based on years of trawling through random data. If you write "a man stands on the side of a road" the software could give you a 7-foot Maasai warrior standing on 5th Avenue, or an aged French peasant standing on the paved avenue leading to a Roman amphitheatre, surrounded by gladiators.

Obviously the simple solution to this is to be more specific, which - taken to its logical conclusion - means you'll need to craft a properly formatted shooting script for every scene, complete with detailed descriptions of each and every character. And we all know from the "please review my script" threads how challenging this can be. This is why I wouldn't worry about 100-million job losses, because for every digital artist that gets booted off the production team, a new AI-text-prompt-writer will have to be employed.
 
This might be a little off topic, but it relates to the capabilities of robots like ChatGPT.

One of the few things I learned from my higher education is the necessity, in academic writing, of proper documentation. (Including how to build a correct "Works Cited" page--talk about persnickety formatting, lol.) ChatGPT not only doesn't do documentation; if I understand it right, it is incapable of doing it. It doesn't know, it can't say, why it says what it says.

In my experience it is frequently just flat out wrong. For example, I asked it to summarize a Poe story, Berenice. I don't want to spoil the story--one of Poe's weirdest which has always creeped me out and which is certainly worth a read--but in it, a corpse is exhumed, and GPT wrote:

He eventually digs up her grave and discovers that her body is intact.

This is only true-ish. Yes he digs her up, but he has no memory of it, so nowhere in the story does he "discover" it. Also, the body was just buried, and so of course it was "intact."

When I told it it was incorrect, it apologized, and then got it wrong again:

Eventually, he becomes overwhelmed by his obsession and decides to exhume her.

The truth is that he apparently had become overwhelmed by his obsession. He doesn't "decide" anything, which is kind of the whole point of the story. I gave it a few more tries. In one, it has the guy waking to find certain items scattered across the floor. And this simply doesn't happen.

Another example: I was wondering how incorrect it would be, in a thing in a thing, to say that JS Bach composed The Well Tempered Clavier while he was in jail in Weimar. So I asked it when and where the WTC was composed, and if it might have been composed in prison. It wrote:

Bach composed "The Well-Tempered Clavier" during different periods of his life, starting around 1722. [...] There is no evidence to suggest that Bach wrote any part of it while he was in prison. In fact, Bach was never imprisoned during his lifetime.

There is an extant autograph manuscript of the WTC that is dated 1722, so of course he didn't "start" it in that year. And I was surprised to hear that the prison episode, part of every biography, was incorrect. I told it I was pretty certain it was wrong, mentioning the work of Christoph Wolff, and it (now seeming defensive: "did you not hear me?") wrote:

I'm sorry for any confusion, but there seems to be a misunderstanding. There's no widely accepted evidence that Johann Sebastian Bach was imprisoned at any point in his life. Christophe Wolff, a renowned Bach scholar, has extensively researched and written about Bach's life, but there is no indication in his works or in other scholarly accounts that Bach was incarcerated.

Dude, I'm looking at the book. This, from Wolff's seminal biography, Johann Sebastian Bach: The Learned Musician:

Apparently for no other reason than a show of anger, the Cöthen capellmeister-designate was kept in jail for nearly four weeks, a period that marked the absolute low point in Bach’s professional life.

And in The New Bach Reader, Wolff's definitive collection of every extant relevant document (sadly, a pretty slim book), he quotes this actual piece of contemporary paper, an excerpt from a court secretary's report:

On November 6 [1717] , the quondam concertmaster and organist Bach was confined to the County Judge's place of detention for too stubbornly forcing the issue of his dismissal and finally on December 2 was freed from arrest with notice of his unfavorable discharge.

I told it again that it was wrong, but it (now a little snotty) stuck to its guns with this bit of word salad:

Christoph Wolff is a notable Bach scholar, but it's important to evaluate the historical evidence within the broader context of Bach's life and times. If there is documentation suggesting Bach's imprisonment in Weimar, it would be advisable to consult reliable historical sources to understand the circumstances and implications of such an event.

What? Dude--Wolff is the reliable source. He is (unlike you) a meticulously scrupulous scholar. But I gave up. What bothers me is the flat-out declarative certainty it has, even when it is wrong.

What this all means, simply, is this: you cannot reliably use ChatGPT for research. It is too transparently, maddeningly, dumb.
 
Specifically on the topic of creativity, Sabine Hossenfelder this week highlights some recent studies showing how "AI" becomes less creative as it "learns" from its own output, output which has either been homogenised to the point of banality, or has repeated the same kind of error so often that the model now thinks the false state is the normal one.

This latter point is essentially the same one I made in relation to the various cats we've seen, i.e. the various development teams are racing ahead with ever more technological whizzbangery without fixing fundamental problems, like knowing how many legs a cat has. In the meantime, every six-fingered, double-left-handed man and his five-legged shape-shifting dog is churning out more and more "content" from which the less discriminatory models can mis-learn.

Then again, there are plenty of humans willing to generate and believe all kinds of nonsense, so maybe none of it matters ... :rolleyes:
 
But can AI knock up his housekeeper and ruin his chances of changing the constitution and running for president? Of course not - only Arnold can do that.
Great line and I hear you :) So many other women available to him but he had to go for the most convenient one ....

But the constitution was never going to be changed just for him - that thing is locked in granite. (Translation: I think it's overdue to be amended in many ways and wouldn't object to that change but even if Congress voted for that it would never get ratified by enough states in his lifetime.)
 
@CelticRambler, on that topic, and with comic relief in mind, yes, AI can do things we cannot do. But can AI knock up his housekeeper and ruin his chances of changing the constitution and running for president? Of course not - only Arnold can do that.

If only he had a robot housekeeper to boink instead
 
But the constitution was never going to be changed just for him - that thing is locked in granite.
Way off topic now, but this aspect of the American Dream - wanting to be governed by rules laid down by a few privileged white dudes more than 200 years ago - is something we Old Worlders look at and scratch our heads. :contract:

Here in France, our constitution was updated yesterday, for the 26th time since 1958 (coz the whole thing was scrapped and re-written for the fifth time then); and on Friday in my homeland, the People of Ireland will be invited to approve the 39th and 40th amendments to that charter, adopted in 1937. The Swiss seem to update theirs about three times a week, but they're a bit fanatical about self governance! :D
 
It's going to take me a second to respond to this thread, I've actually been working towards making a response for some time here. I'm not going to debate any of what's been said here, there are plenty of reads and opinions, and I think all are valid from one perspective or another.

The only thing I'll add to this conversation are these 3 observations.

1. Watch the trajectory rather than the position. This tech is moving fast, and while we can all poke the slow AI and laugh at its basic mistakes and lack of real creativity now, there is at least some reason to believe that it may not be that way for long.

2. Many speak of AI in the generic. We're actually describing a vast field of technologies, each one with a different set of challenges to overcome. LLMs for example might take far longer to reach maturity than text2img, and text2img may be flawless well before text2video, which may well be flawless long before video2video, which is what I'm doing.

3. Indietalk is right, feel however you like about it. People on land don't need to worry about a shark population explosion, and people at sea don't need to worry about collapsing bridge infrastructure. We're all in different circumstances, and the probable future will mean different things for each of us. I'm not an AI zealot, just an AI researcher.

When I do finally get around to finishing a reply to this thread, it won't be text, it will be a video. I've said most of what I have to say in terms of debate already, so what I'm going to do next is simply post a video showing the actual processes at work, running on a server farm, so I can sort of demystify some of these technologies, and help people understand how fundamentally different each one is from the others. I think it might be enlightening for some to just see someone pop the hood on a few of these, and show you where the intake valve connects, etc.
 
Here is a 2-min video on the event:
Apparently the museum already uses a lot of VR and AI in their exhibitions.
This is really an amazing collection...we saw it back in the late 70s when it was housed in part of a corporate office in Cleveland, presented on little partition walls. Since its move to St. Petersburg (and the amazing new building and dome) it has become quite a phenomenon.
 
I've been to that museum - definitely terrific.
Wow - I'm going to have to make the trip. Although there was definitely something very surreal about the works just stuck up on a bunch of portable dividers like this in a barren office setting.

divider.jpg
 
Which version of ChatGPT did you use?
 
The (extraordinarily high) energy consumption of AI seems to have suddenly become a hot topic in my corner of YouTube. Up to now, the operators of the various systems have been quite cagey regarding the subject, and attributing a lot of consumption to the (supposedly) one-off cost of teaching/learning. Now we're starting to get some info regarding the cost of each question posed, and - relevant to this forum - each image generated.

One assessment is that generating a single image uses approximately the equivalent power needed to fully charge a mobile phone. Fair enough, not too bad. Unless that can be extrapolated to video generation in a linear fashion - so 24 images for one second of footage, or just short of 1500 charges for one of Sora's one-minute clips.

Now obviously the cost of that will be buried in some kind of subscription, but there's no way the energy companies will give us those electrons for free, and the service providers' shareholders aren't going to be too keen on donating all their profit to wannabe movie-makers. More than likely, the cost of a subscription will settle at a rate that's suspiciously close to the cost of making a movie in the traditional way.
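
For anyone who wants to check the arithmetic in the post above, here is a minimal sketch of the linear extrapolation it describes. It takes the post's figures at face value (one generated image is roughly one phone charge, 24 frames per second, a one-minute clip); the function name and constants are purely illustrative, and whether video generation really scales linearly with frame count is exactly the open question.

```python
# Figures taken from the post; the linear scaling is the post's own "unless" assumption.
FRAMES_PER_SECOND = 24    # frame rate used in the post
CLIP_SECONDS = 60         # length of one of Sora's one-minute clips
CHARGES_PER_IMAGE = 1.0   # "one image ~ one full phone charge" assessment


def charges_for_clip(seconds, fps=FRAMES_PER_SECOND, charges_per_image=CHARGES_PER_IMAGE):
    """Phone-charge equivalents for a clip, if every frame cost as much as one still image."""
    return seconds * fps * charges_per_image


if __name__ == "__main__":
    print(charges_for_clip(1))             # one second of footage
    print(charges_for_clip(CLIP_SECONDS))  # one full one-minute clip
```

A one-minute clip works out to 24 x 60 = 1,440 frames, which is where "just short of 1500 charges" comes from.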
 
It's not even remotely that high. I run off about 3,000-5,000 images a day during various A/B testing phases, some animation sequences, some stills, and I'd say my cost is under 5 dollars a day for that specific thing. My chargers draw about 17 watts, and my state-of-the-art flagship GPU is pulling about 400 watts. Training the brains is a different thing though, and yeah, you'll eat through some power training one model. Not anything out of the ordinary vs other uses of electricity. For example, I suspect that one stadium football game uses about enough electricity for AI to translate the Library of Congress into every known language.

I'd say all AI usage combined right now is probably drawing about as much power as it takes one corporate skyscraper to be air conditioned during the night hours when no one is there. OpenAI is the exception, and they will probably need their own reactor at some point. They have over a billion in the bank though, and I think you can set up a clean reactor for about 50 million.

It would make a lot more sense to complain about the power draw of Bitcoin hashing, which is thousands of times more impactful. Of course they pretty much stopped Bitcoin mining a few years back. Also if you had a specific need to declare some niche holy war on some type of electrical usage, you'd be shocked by how much power and cost YouTube incurs daily. Way, way, way, more than all AI combined right now.

It's just another person somewhere trying to draw attention to themselves by grandstanding about the evils of X. Now give it a few years, and it might change. We're entering an era of unprecedented supercomputing power and usage, and there's no question that power consumption will hit record highs in the near future, but honestly, it's very minor right now.

I'm going to venture a guess and say that none of the people "raising awareness" of this "issue" have even the slightest clue about the actual metrics of this situation. Anyway, yeah, eventually AIs will increase global power usage by a bit. Here's the difference between spending those kilowatts on AI or a football game. The AI can design and provide blueprints for a more efficient power plant, and the football game can't.

I'd add that my image AI draws a lot MORE power than what's typically sold on service websites, and still costs very little. Also, you can find tons of people giving away AI generation services on the web for free. I can't yet accurately estimate the cost of an AI film when my system is fully functional, but I'd guess at around 20 grand for all AI and render cost combined. Machines would probably run you about 20k, but that's a one time expense. Your bigger expenses would come from voice acting and marketing.
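
The per-day numbers in this post can be sanity-checked with a rough back-of-envelope sketch. Only the 400-watt GPU draw and the 3,000-5,000 images a day come from the post; the assumption that the GPU runs flat out for 24 hours and the electricity price of 0.15 dollars per kWh are hypothetical placeholders, so treat the result as an upper-bound illustration rather than a measurement.

```python
GPU_WATTS = 400        # GPU draw quoted in the post
IMAGES_PER_DAY = 4000  # midpoint of the 3,000-5,000 range quoted in the post
HOURS_PER_DAY = 24     # ASSUMPTION: GPU at full load all day (worst case)
PRICE_PER_KWH = 0.15   # ASSUMPTION: hypothetical electricity price in dollars


def daily_kwh(watts, hours):
    """Energy used in one day, in kilowatt-hours."""
    return watts * hours / 1000


def daily_cost(watts, hours, price_per_kwh):
    """Electricity cost for one day at the given price per kWh."""
    return daily_kwh(watts, hours) * price_per_kwh


def wh_per_image(watts, hours, images):
    """Average watt-hours spent per generated image."""
    return watts * hours / images


if __name__ == "__main__":
    print(f"{daily_kwh(GPU_WATTS, HOURS_PER_DAY):.1f} kWh per day")
    print(f"{daily_cost(GPU_WATTS, HOURS_PER_DAY, PRICE_PER_KWH):.2f} dollars per day")
    print(f"{wh_per_image(GPU_WATTS, HOURS_PER_DAY, IMAGES_PER_DAY):.1f} Wh per image")
```

Under those assumptions the GPU uses 9.6 kWh a day, which is consistent with the "under 5 dollars a day" claim at any ordinary household electricity price. It covers the GPU only, not the rest of the machine or any training runs.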
 