The Digital Insider | The challenges ahead for generative AI

In 2023, the creative industries were electrified (stunned and stimulated in equal measure) by Generative AI. Over the coming months, this tumultuous social change will continue to play out.

Central to the immediate debates is a potential major juncture in the history of intellectual property rights. Should big tech, or anyone for that matter, be allowed to feed copyrighted works into today’s phenomenally powerful machine learning algorithms?

Existing “fair use” clauses in many countries, notably the US, permit some uses of copyrighted material for training algorithms, but this permission is not clear-cut.

Such exceptions predate the current reality of Generative AI’s capability, which brings them into conflict with artists’ rights. When artists’ creative work is being used to train AI systems that then compete with them in the same creative marketplace, a strong argument can be made that this cannot constitute “fair” use.

Rightsholder advocates are fighting this case convincingly. The outcome remains far from clear though, and could yet upend the practices of leading Generative AI companies like OpenAI and Stability AI.

Another prominent debate concerns whether people should have the right to know if creative work is AI generated, much as we expect to know what harmful substances are in products, or whether clothes were made in a sweatshop.

For example, the Recording Academy, which runs the Grammy Awards, declared that a song must be largely composed by a human to be eligible for an award. This presents the profound challenge of agreeing on where the boundary lies between human and AI contributions to a work, when AI processes might be found at thousands of points in a creative workflow.

Simultaneously, the commercial creative technology software industry continues to grow as a significant economic sector, spawning new local industry representative bodies, such as Music Technology UK.

Governments face the conflicting demands of supporting both artists and this attractive, economically buoyant (potentially bubbly) start-up and innovation sector. But despite some very real battle lines, the reality is a more complex jungle of jostling actors.

Many Generative AI technologists, artistically oriented themselves, take care not to pit their work “against artists”.

Pro-artist stance

AI music start-up veteran Ed Newton-Rex recently quit his job at Stability Audio (the audio arm of the hugely successful Stability AI, makers of the popular Stable Diffusion image generation tool) in protest at the company’s belief it was justified in scraping copyrighted artistic works for use as training data.

Meanwhile, a large proportion of competing AI companies in this area, either on moral grounds or to minimise risk, are on board with a pro-artist stance, for example by signing up to the Human Artistry Campaign, which advocates on behalf of artist groups for favourable policies.

They either use “safe” training data (which could mean it is copyright-free, formally licensed, or their own in-house content) or they employ entirely different AI techniques that do not depend on training machine learning systems on original content.

The most ground-breaking machine learning techniques have achieved their impressive capability by learning from as much cultural content as they can possibly lay their hands on. The more content, the better the generative results.

But this is not to say that powerful generative systems can’t work with more constrained data sets, or even with paradigms altogether different from learning from data. These include rule-based systems and the use of human feedback (e.g., listener ratings) to improve generation.

Respecting copyright in training data doesn’t mean creative AI businesses are not having detrimental effects on creators. It will be important for the creative industries to look beyond the current debates around licensing training data, and to assume that powerful generative techniques will emerge that don’t infringe copyright.

The arrival of such techniques means two things: everyone everywhere all at once making more creative work; and the growth of commercial business models aiming to monetise that creative production. Businesses may do this through a range of means, from simply selling creative software, to claiming copyright themselves, to positioning themselves in the distribution process, to Meta- and Google-style data-extractive practices.

Technologists who are “for artists” insofar as they respect artists’ copyright when training ML models may still be very active in changing the commercial landscape.

With that in mind, the incumbent major players of the creative industries, the Disneys, Warner Musics and BBCs of the world, are especially important because they combine both copyright and tech innovation interests under the same roof.

They may be the clear winners, having access to their own troves of creative content data, alongside well-funded R&D arms and an appetite for new markets.

An example is Getty Images’ AI image generator, trained on Getty’s own stock and promising to be commercially safe, with “no intellectual property or name and likeness concerns, no training data concerns”. This adds further to the complex moral landscape.

The upheaval will cut deeper still into long-standing conceptions of artists’ moral rights, and the practicalities of defending those rights.

Tech breakthroughs

There is also the potential for specific technology breakthroughs that could further redirect the course of change in 2024. Much of the work will be in the pipelines, workflows and user interfaces that bring generative AI into different industries, especially with respect to collaboration and human-computer co-creativity.

Digital creative workers will respond best to technologies that can neatly slot into existing ways of working, and this will be a key design goal for technologists.

With the ascent of powerful text-based generation tools at the heart of the Generative AI revolution, we can expect natural language itself to become a more prevalent interaction modality at multiple points in a workflow.

With AI also enabling us to take apart creative content (de-mixing music tracks or extracting gestures from an actor), media assets take on a new flexibility: the file format or encoding doesn’t matter as much if you can still manipulate the content.

But a bigger prospect still is the scenario in which someone successfully taps into audience feedback to continually modify Generative AI models.

Imagine, for example, an AI music generator that can generate tracks and deploy them directly to a large audience (say, by being incorporated into a popular “relaxation” stream on a major streaming service). With enough listener feedback, which might just consist of whether they skipped the track or not, and a good enough algorithm, such a system could begin to take on emergent properties that, for the first time, break out of the orbit of human production and consumption.

Many are watching OpenAI with increasing anxiety as they experiment with this “reinforcement learning from human feedback”. It is yet to match the power of traditional deep learning techniques, but has the potential to transform what role generative machine processes play in cultures.
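To make the idea of a skip-driven feedback loop concrete, here is a minimal sketch in Python. It is not how OpenAI or any streaming service works; it swaps in a much simpler technique, an epsilon-greedy bandit, and assumes a hypothetical generator with a handful of fixed "variants" plus simulated listeners whose skip probabilities are invented for illustration. The function name `run_feedback_loop` and all parameters are assumptions of this sketch.

```python
import random


def run_feedback_loop(skip_prob, rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy loop over hypothetical generator variants.

    skip_prob[i] is the (simulated, made-up) chance that a listener skips
    a track produced by variant i. The loop never sees these numbers; it
    only observes skip / no-skip and keeps a running average per variant.
    """
    rng = random.Random(seed)
    n = len(skip_prob)
    plays = [0] * n      # how often each variant was deployed
    value = [0.0] * n    # estimated "listened to the end" rate per variant

    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                      # explore: random variant
        else:
            arm = max(range(n), key=lambda i: value[i])  # exploit: best so far

        # Simulated listener: reward 1.0 if the track was not skipped.
        reward = 0.0 if rng.random() < skip_prob[arm] else 1.0

        plays[arm] += 1
        value[arm] += (reward - value[arm]) / plays[arm]  # incremental mean

    return plays, value


if __name__ == "__main__":
    plays, value = run_feedback_loop([0.6, 0.4, 0.2])
    for i, (p, v) in enumerate(zip(plays, value)):
        print(f"variant {i}: deployed {p} times, estimated completion rate {v:.2f}")
```

Even in this toy form, the loop drifts towards whatever listeners skip least without any human deciding what "good" means. A real system would adjust the parameters of a generative model rather than choose among fixed variants, which is a far harder problem.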

This hopefully portrays the current complexity of the Generative AI world. Those positioning themselves to be disruptors are just as vulnerable to being outcompeted or having their business models made redundant by changes elsewhere in the chain. Many a line of code and many a neural network will be created in vain.

Based on current trends, there can be little doubt that Generative AI will continue to advance and astound in 2024.

For some, we’re on a path of continued acceleration. Others take a more punctuated, episodic view of change, where the rate of advance may flatten, but few are claiming things will level out in the coming year.

Even without more jaw-dropping breakthroughs, the consolidation of infrastructure, interface design, integration and upskilling will continue to drive the growth and power of generative AI.

Source: UNSW