Science and Nature have each adapted their editorial policies in response to LLMs like ChatGPT. Their approaches are interestingly different.

Science says:

Artificial intelligence (AI) policy: Text generated from AI, machine learning, or similar algorithmic tools cannot be used in papers published in Science journals, nor can the accompanying figures, images, or graphics be the products of such tools, without explicit permission from the editors. In addition, an AI program cannot be an author of a Science journal paper. A violation of this policy constitutes scientific misconduct.

Science editorial policy on AI

Nature says:

Large Language Models (LLMs), such as ChatGPT, do not currently satisfy our authorship criteria. Notably an attribution of authorship carries with it accountability for the work, which cannot be effectively applied to LLMs. Use of an LLM should be properly documented in the Methods section (and if a Methods section is not available, in a suitable alternative part) of the manuscript.

Nature editorial policy on LLM

Both journals agree that an AI can’t be an author. That’s tough if you are an artificial intelligence, but mixed news for humans who have, according to a Science editorial, “wonderful computer[s] in our heads”. I don’t know what I’ve got in my head (rocks, Kennedy, I can hear my old head of 6th form growl) but it’s not a computer (which is a fancy kind of rock or mineral).

Long term, AI will be used for all sorts of tasks, including, but not limited to, doing literature searches, summarising and digesting papers, analysing data, writing up the results, discussing them, forming conclusions, developing new ideas, designing new experiments, testing them, lathering, rinsing and repeating. Pieces of this already exist and are part of everyday working processes for some scientists. I’m not going to guess what that looks like all joined up-1, but I can’t imagine it will fit with traditional notions of authorship and papers. Science and Nature don’t feature heavily in a future with no papers and with no one hunting the kudos they bring, so it’s nice of Nature to try and document their own obsolescence.

We shouldn’t look to journals to learn how AI will change science as their whole existence is predicated on the paper as the fundamental unit of knowledge0. Their immediate concern – for which these rules are a band-aid – is working out how LLMs will affect the process of generating papers. The wider and more important question is how to epistemologise1 with AIs and how to avoid being swamped by the tsunami of crap that AI enables. There are probably others, but worrying about how AI will affect a practice that’s already on its way out2 is only a major concern if you think the current system is great and doesn’t create the exact conditions in which someone would consider using an AI to write papers for them. The status quo is already something of a shambles that seems to have some working parts despite itself, and it will probably always be a bit of a shambles no matter how we seek to improve it, but we should at least try.

There are already journals full of paper mimics – texts that look like papers but contain nothing of use. This paper was widely ridiculed for using capital-Ts in place of error bars in one of its figures, but if one dips into other papers in the same special issue, it’s evident that the ratio of actual papers to mimics is low, a ratio that my limited sampling of the whole journal suggests is fairly standard (the publisher is a bit of a tell here). The T-error-bar paper has been retracted, presumably in response to the attention it gathered, but the rest remain, with many being a confection of copy-pasted textbook science, uncaptioned figures, nursery-level schematics, non-sequiturs and general weirdness. This is just one journal of the thousands that are out there. If these papers are being used as CV-filler, or serving some other nefarious purpose, then AI could at least make them readable.

AIs are built into much of the software we use, so are already unavoidable. Not all of it is yet as sophisticated as something like ChatGPT, but widely used tools like Word and Grammarly can already finish sentences for you, or rewrite whole texts to be more fluent, to sound more confident or more professional. This is not a new phenomenon either; it’s just widely available now and cheap. LLMs can be used in lots of ways short of creating whole texts out of nothing, so it seems inevitable that they will be incorporated into writing tools in lots of different ways which are hard to guess right now. AI more broadly has an infinite number of applications.

I’ve played around with LLMs (ChatGPT and, very briefly, Galactica), asking them to do the following tasks which I would find useful.

  1. Turn my garbled notes into coherent paragraphs.
  2. Rewrite my already written semi-coherent, comma-soaked, parenthetically-baggy paragraphs.
  3. Shorten a rambling3 text.
  4. Pull out key points from a text.
  5. Summarise a text4.
  6. Generate introductory material on a topic.
  7. Generate a first draft based on some key facts5.
  8. Rewrite my text as an 80s power ballad.

The list goes on and on and on. At the moment, the results are barely worth the effort. When asked for the key points of a paragraph, ChatGPT gave me an itemised list of all the sentences in the paragraph. It did an OK job of turning notes into text, but its interpolations were rather more creative than I was comfortable with. Introductory material was a mix of sensible stuff and confident bafflegab. The AIs need close supervision for all these tasks (except for the power ballad one), but they’ll get better fast. That doesn’t mean they won’t need supervision though. It rather depends on the use.

Various people have pointed to clear failures of logic by LLMs, but it’s not that computers can’t handle logic; it’s that logic doesn’t emerge from the statistical models on which current LLMs are based. There’s no reason why logical components can’t be added separately. I would also point out that many scientists are not great at logic either, despite how they are often portrayed in popular culture.

Banning the use of AI to help in paper writing seems like a losing choice. I’m interested to see what creative uses for AI people will come up with even within the narrow topic of writing assistance. I’m also interested to see what creative new problems this will cause. Current systems happily reproduce hideous biases that are present in their training data. They get patched to hide this fact – usually giving a boilerplate response to obviously contentious questions – but it’s not difficult to cajole them into saying exactly the same things with a reformulation of the question, or sneaking it out in other answers. While we’d hope people would do better, the evidence is that they don’t. The evidence is there in the training sets of the LLMs.

Time will quickly erode policies banning or restricting the use of AIs anyway. Students will have grown up using these tools, and what might seem to an older person an affront to all that is good and right in this world might seem to the student an unobjectionable, perfectly natural, and eternal part of the scientist’s toolkit. In time, those students will be in charge and might wonder why such interdictions were ever needed.

While much of the above might seem like an argument that people are crappy so why not use crappy machines10, the central point is that scientists will find ways of using these new tools to help them advance the frontiers of knowledge9 and, hopefully, by extension, help improve society8. Not using them at all, or using them sneakily, aren’t long-term solutions. That they might be misused is obvious, but that needs to be part of the discussion too. Some problems are obvious up front, some take time to surface, some arise after long familiarity. One has to be alert to all these possibilities. That discussion obviously needs to be embedded in the wider discussion about how to adapt and improve scientific processes and practices and how they connect to everything else.


-1 I close my eyes and see AIs creating streams of structured information that other AIs can consume, analyse, modify and add to, linked to laboratory facilities that are completely mechanised. If asked nicely by a person, an AI may deign to try and explain what it knows in whichever human language the asker requested, using pictures, words, music (and whatever other sensations might be readily synthesised) as needed. I can’t imagine this would always be possible. There’s a limit to what a human can comprehend or process fully, something which the AI would probably know. What it would do in that situation is anyone’s guess but it probably wouldn’t say “sorry you’re too thick to understand”. Not more than once anyway.

0 I’m pretty sure they’ll use them to “help7” proofreaders if they’re not already.

1 Is that a word? Incidentally, I asked an LLM to make up five new words. It gave me three words that already exist and two that aren’t in dictionaries but are googleable.

2 Like one of those guests at a party who can’t take the hint and go home, draining one’s reserves of patience, civility, and brandy.

3 I ramble. Not everything is as succinct as my blog posts.

4 Years ago, Science (I think) published a paper which described a system for summarising scientific papers. They used the system to write the abstract for their own paper. I can’t find the paper now, which is a shame, but I did find an excerpt from 1994 with a letter referring to an earlier paper about automatically summarising texts, noting that the authors had missed a tremendous opportunity by not using their method to summarise their paper.

5 In the early 2000s, after having read and written one monitoring report too many, I realised that a lot of what goes into them can be summarised automatically. I wrote a script that took in maps and time series of climate data, performed standard analyses, identified areas and features of interest, amalgamated these regionally and wrote a summary. The summary writer had a set of pre-coded sentences with different ways of wording commonly used fragments of text. It would choose these at random to make the output feel a little less robotic6. What came out at the end was never going to win any prizes for elegance, but it saved me time every month in composing reports. It didn’t just replace a tedious task though; it helped me do a better job. One time it highlighted areas that I had overlooked because the default plotting code used contours that made some features vanish. That led to improvements in the choice of plotting parameters, as well as flagging up some interesting events. The time saved was used to think more carefully about what “significant” meant for data with a large component of measurement error. It was also more “objective” than I was and gave appropriate weight to areas of unusually high and low temperature, whereas my tendency was to balance the two. The whole process of automation was worthwhile as it forced me to think about what the essential task was and how that could be broken down, as well as making a process that was reproducible, traceable, defensible etc.
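The core of that summary writer can be sketched in a few lines. This is a minimal illustration, not the original script: the function names, the templates, and the significance threshold are all my assumptions.

```python
import random

# Several pre-coded wordings for each commonly used fragment; one is
# chosen at random so repeated reports feel a little less robotic.
TEMPLATES = {
    "warm": [
        "It was significantly warmer than average in {region}.",
        "{region} saw temperatures well above the long-term mean.",
        "Above-average warmth dominated {region}.",
    ],
    "cold": [
        "It was significantly colder than average in {region}.",
        "{region} experienced temperatures well below the long-term mean.",
    ],
}

def summarise(anomalies, threshold=1.5):
    """Turn a mapping of region -> temperature anomaly (in standard
    deviations) into summary sentences, mentioning only the regions
    whose anomaly exceeds the significance threshold."""
    sentences = []
    for region, anomaly in sorted(anomalies.items()):
        if anomaly >= threshold:
            sentences.append(random.choice(TEMPLATES["warm"]).format(region=region))
        elif anomaly <= -threshold:
            sentences.append(random.choice(TEMPLATES["cold"]).format(region=region))
    return " ".join(sentences)

print(summarise({"Scandinavia": 2.1, "Iberia": -1.8, "Alps": 0.3}))
```

Separating the analysis (which regions are significant) from the wording (which template to use) is what made the rest of the benefits possible: the threshold became an explicit, inspectable parameter rather than a judgement call made afresh each month.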

6 It was still prone to saying things like “It was significantly warmer than average in…” and then spitting out a long list of countries in Europe rather than just saying “much of Europe”. I took this as a sign that robots might want to sound like robots, just as humans want to sound like humans.
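The fix the footnote hints at is a small amalgamation heuristic: if the flagged countries cover enough of a known region, name the region instead of listing its members. A rough sketch, in which the region membership and the two-thirds coverage threshold are illustrative assumptions rather than anything from the original script:

```python
# Hypothetical region membership; a real version would be more complete.
REGIONS = {
    "Europe": {"France", "Germany", "Spain", "Italy", "Poland",
               "Sweden", "Norway", "Finland", "Greece"},
}

def amalgamate(countries, coverage=2 / 3):
    """Replace a run of country names with 'much of <region>' when they
    cover at least `coverage` of that region's members; anything not
    absorbed into a region is still listed individually."""
    remaining = set(countries)
    phrases = []
    for region, members in REGIONS.items():
        hits = remaining & members
        if len(hits) >= coverage * len(members):
            phrases.append(f"much of {region}")
            remaining -= hits
    phrases.extend(sorted(remaining))
    return ", ".join(phrases)

print(amalgamate(["France", "Germany", "Spain", "Italy",
                  "Poland", "Sweden", "Morocco"]))
# prints "much of Europe, Morocco"
```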

7 Replace

8 In so far as one believes that’s how it’s supposed to or intended to work.

9 Or to reverse elegantly out of whichever cul-de-sac the frontier of knowledge advanced confidently into last week.

10 Looked at from a certain point of view, science is a set of behaviours, norms, heuristics and processes by which the crappinesses that we are prone to – pettiness, ego, stupidity, arrogance, thoughtlessness – can be tricked into occasionally doing something of use.