With the advent of ChatGPT, large language models (LLMs) have suddenly become much more familiar. When ChatGPT 3.5 was released, it felt like a Little League player managing to hit an easy pitch from a professional pitcher. But with GPT-4, it seems to have grown to the level of a Koshien championship pitcher, the kind you want to ask, "Why don't you join our team next year?"
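It has also been a great help with English proofreading of papers, which is my main line of work. When I simply asked it to "proofread this English," it returned a flood of words I had never seen in a paper, and I thought it was of no use. But if I first talk to it along the lines of "I do research in molecular biology and am interested in RNA. Recently I have been analyzing the function of non-coding RNAs and found something very interesting, so I am writing a paper. Could you help me with the English? I don't want overly elaborate expressions, but I would like it to read naturally to professional scientists who are native English speakers. It would also help if you fixed any parts that are logically hard to follow," and then hand over the text with "OK, here is the introduction," the proofreading that comes back is far more natural. I heard from Professor Arakawa at Keio University that "it's good to praise it once in a while," so when a correction fits perfectly, I pat it on the head with "Wow! That's perfect!", and it does seem to proofread even better afterward. When I finish writing a paper, it truly feels like I have completed a job together with a work partner. When I was using DeepL Write, I was too afraid to submit without a native check, but there is no doubt it has now reached a level where I feel I could submit the text as is.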
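Another area where LLMs show unrivaled strength is creating graphs from CSV files in R. If I paste in a CSV file as is and ask, "I have this data. Write R code that arranges it by the items in column XX and draws a bar graph overlaid with a beeswarm plot, with XX on the Y-axis," it writes the code beautifully. It makes full use of dplyr for data wrangling and digs up functions that make me think, "Oh right, that function exists," so it saves far more time than the days when I plodded through troubleshooting with a cheat sheet in hand. It occasionally guesses wrong about which items to arrange on the X-axis, but that is only because my instructions were insufficient; if I say, "No, that's not it," it instantly writes new code without the slightest complaint. It is also very good at dealing with errors. It handles everything from mapping next-generation sequencer FASTQ files to basic analysis with ease, and it is truly astonishing how quickly it writes shell scripts for processing large numbers of files, just like an awk artisan.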
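It also looks like it could take on various tasks related to academic societies. What interests me most personally is simultaneous translation: I have the feeling that by combining the iPhone's speech recognition with ChatGPT's language understanding, automatic translation has come close to a workable level. I once had it convert a Japanese talk on microRNA that I found on YouTube into English text, and I was deeply moved to see that "アルゴの音" (literally "the sound of Argo"), a misrecognition produced by the iPhone's speech recognition, had been correctly converted to "argonaute." On reflection, we ourselves do not recognize speech perfectly by ear; we "understand" by correcting words in our heads based on the surrounding context, and that is exactly the ability LLMs have. Making abstracts fully bilingual is also looking more realistic. Japanese-to-English translation is quite good, so even people who are not comfortable with English can have their text translated as long as the original Japanese is well written; and even if they start from clumsy English, repeated dialogue will turn it into clear, grammatically correct English, which is arguably the better deal. After all, a partner who will stay by your side and proofread your English this closely is hard to find. For international students who are not comfortable with Japanese, submitting a Japanese abstract automatically translated from English will surely bring their work to the attention of more people. For native Japanese speakers, the amount of information that can be taken in at a glance is far greater in Japanese than in English.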
中川 真一
My own research environment has also changed significantly. The most notable change is in English proofreading. For example, when emailing someone overseas, even if they are a close friend, I used to worry about whether my expressions were correct or not and whether I was being rude. I would search Google for each phrase, feel relieved by the number of hits, then pale at the sight of the many images that popped up. But now, with a simple request like, "I wrote this email to a close overseas friend, can you correct the English?" or "Is the main point clear in this email to the meeting's secretary?", ChatGPT adjusts it nicely. Phrases that ChatGPT suggests often become part of my own English usage, seemingly helping me improve my English skills.
It is also a great help with English proofreading of academic papers. At first, when I used a simple prompt such as "Edit English," I got back many unfamiliar words that never appear in the papers we usually read, and I thought it wouldn't be useful. But when I speak to it more specifically, for example, "I'm doing research in molecular biology and am particularly interested in RNA. Recently, I found something interesting in the functional analysis of non-coding RNA and I'm writing a paper about it. Can you help me with the English? I don't want overly fancy expressions, just something natural for a native English-speaking professional scientist. And if there are logically unclear parts, please correct them too," and then say, "Here's the introduction," the proofreading I get back is remarkably natural. I heard from Professor Arakawa at Keio University that "it's good to praise it occasionally," so when I get a perfectly fitting correction, I praise it with "Great! Perfect!" and it does indeed seem to get better at proofreading. When I finish writing a paper, it feels like completing a task with a close partner.
I haven't used it to write the main body of a paper yet, but it might be quite useful for writing reviews. For example, if I say, "Write a review about recent work on long non-coding RNA," it returns a reasonably good text. If I narrow the theme down and give it some papers I want to cite, it produces a perfectly formatted review. I was once invited to write a review for a journal and, just for fun, asked ChatGPT to write an essay. It did a surprisingly good job of summarizing. This could be seen as a threat to the very existence of researchers. But on second thought, a review that simply compiles accurate knowledge might not need to be written by humans anymore. What will be required of reviews in the future might be, dare I say, "personal opinions." If that is accepted, writing reviews becomes more enjoyable, since researchers have plenty of things they want to shout out at the center of the world. I think we're in a transitional phase, but researchers might be able to focus more on creative work.
LLMs are also good at creating graphs from CSV files using R. If I copy-paste a CSV file and ask, "I have this data. Can you write R code that arranges it by the items in column XX and draws a bar graph overlaid with a beeswarm plot, with XX on the Y-axis?", it writes beautiful code. The way it uses dplyr for data manipulation and digs up functions I had almost forgotten about is a huge time saver compared to the days when I had to troubleshoot while peeking at cheat sheets. Sometimes it guesses wrong about which items to arrange on the X-axis, but that is just a lack of instruction on my part. When I say, "No, that's not it," it quickly writes new code without any sign of annoyance. It is also very adept at handling errors. Mapping FASTQ files from a next-generation sequencer and running basic analyses are a breeze for it, and it writes shell scripts for processing large numbers of files as quickly as an awk artisan.
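To give a rough idea of the kind of batch script meant here, the following is a minimal sketch, not code from an actual ChatGPT session. It assumes uncompressed FASTQ files with four lines per record; the function name count_fastq_reads and the directory name in the usage line are hypothetical.

```shell
#!/bin/sh
# Minimal sketch: print the number of reads in every FASTQ file in a directory.
# Assumption: files are uncompressed and use the standard four-line record layout.
# The function name count_fastq_reads is hypothetical, chosen for illustration.

count_fastq_reads() {
    dir="$1"
    for fq in "$dir"/*.fastq; do
        # If nothing matches, the pattern stays unexpanded; skip it.
        [ -e "$fq" ] || continue
        # One read occupies four lines, so reads = total lines / 4.
        awk -v name="$(basename "$fq")" \
            'END { printf "%s\t%d\n", name, NR / 4 }' "$fq"
    done
}
```

Calling count_fastq_reads ./fastq_dir > read_counts.tsv would write one tab-separated line per file (file name, read count). Compressed .fastq.gz files would have to be decompressed first, which this sketch does not attempt.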
LLMs seem promising for various tasks related to the RNA Society of Japan. Personally, I'm most interested in simultaneous translation. By combining the speech recognition of the iPhone with the language understanding of ChatGPT, it seems quite possible to achieve a high level of automatic translation. Once, I tried converting a Japanese talk on microRNA from YouTube into English text, and when "アルゴの音," a misrecognition by the iPhone's speech recognition, was correctly translated as "argonaute," I was deeply impressed. In fact, we ourselves do not recognize speech perfectly with our ears but understand it by correcting words in our minds based on context, which is exactly the capability that LLMs possess. Making abstracts fully bilingual is also becoming more feasible. Translation from Japanese to English is quite good, so even those who are not good at English can have their text translated as long as the original Japanese is well written. And even if they start from poor English, the LLM will make it clearer and grammatically correct through dialogue, which is actually the better deal. After all, it's not easy to find a partner who will proofread English so attentively. For international students who are not comfortable with Japanese, submitting a Japanese abstract automatically translated from their English will undoubtedly attract more attention. The amount of information that native Japanese speakers can grasp at a glance is far greater in Japanese than in English.
As for planning actual research projects, it still seems to offer only commonplace advice, so for now that remains a researcher's job. Personally, I do not mind whether I am talking to a machine or a person as long as the discussion is exciting, but it seems we are not yet at the point where LLMs can replicate the coffee breaks or social time at annual meetings. Someday, a world might come where LLMs think up projects and robots conduct the experiments. Even then, it might be mentally easier to outsource the parts that LLMs can handle and enjoy the parts they cannot. After all, humans probably cannot compete on the work that LLMs handle efficiently.
The first time I tried ChatGPT, I was oddly nervous, as if I were writing a letter to a crush in middle school. Now it feels like a close friend, and I seem to have gained one more source of enjoyment in my research.
(The above text is translated by ChatGPT)
Shinichi Nakagawa