Training AI with 139,000 Scripts: The Massive Data Set Anger…

December 17, 2024

eWEEK content and product recommendations are editorially independent. We may make money when you click on links to our partners.

More than 139,000 TV and movie scripts were converted into datasets and used to train AI models by Apple, Anthropic, Meta, and Nvidia without the knowledge of their authors, raising fears that their creative work is being used to train machines that could potentially replace them.

In addition to the 39,000 TV and movie titles, more than 53,000 other movies and 83,000 TV episodes were used to train AI, including a vast array of Best Picture nominees and TV episodes of The Simpsons, Seinfeld, Twin Peaks, The Wire, The Sopranos, and Breaking Bad.

The dataset “even includes prewritten ‘live’ dialogue from Golden Globes and Academy Awards broadcasts,” said The Atlantic’s Alex Reisner, who broke the story.

Dialogues as Datasets

The datasets used to train the AI models did not contain the original scripts, but rather subtitles that were extracted, compiled, and uploaded to OpenSubtitles.org. The use of subtitles instead of the more technical scripts is more concerning to some critics, as subtitles offer a more natural flow of the language used in conversation.

Generative AI models trained on well-written dialogue could not only mimic films but generate entirely new ones, which means AI could conceivably compete with the human writers on whose works it was trained without their permission. This lack of transparency by AI companies has prompted artists, authors, and publishers to file lawsuits to defend the intellectual property rights to their creative output.

“For as long as generative-AI chatbots have been on the internet, Hollywood writers have wondered if their work has been used to train them,” Reisner wrote. “The chatbots are remarkably fluent with movie references, and companies seem to be training them on all available sources.” He created a search tool for the Hollywood AI database to help writers determine whether their work was used.

Response from Scriptwriters

Hollywood writers responded angrily to news of the alleged theft of their work, which follows recent strikes in which the WGA and SAG-AFTRA unions contested the use of AI.

“I’m livid,” said David Slack, who wrote the TV show Teen Titans. “I’m completely outraged. It’s disgusting.” Slack found 42 scripts credited to him in the AI database. “It’s a huge amount of my work . . . These are things that I poured my heart and soul into.” Other popular writers whose work was used to train AI included Grey’s Anatomy creator Shonda Rhimes, who had 508 episodes in the dataset; American Horror Story creator Ryan Murphy, who had 346; and The Simpsons and Futurama creator Matt Groening, who had 742.

AI’s lack of intentionality makes it unable to produce creative works entirely on its own; rather, it relies on the work of human authors in a way that many consider plagiarism. However, the issue is even more complex because, in many cases, the studios rather than the writers own the copyrights to the scripts, giving writers even less agency to seek legal recourse or compensation.

Learn more about the complex legal, ethical, and privacy issues surrounding generative AI technology.
