Think Tokens

Silent Collapse of Human Distribution

The trend of converging to sameness is something that has played out before. Will it be different this time with everyone using LLMs?

Many parts of our world have become homogenized. Cities in developed parts of the world look similar to each other; tech parks across the globe boast nearly identical infrastructure. With the advent of the internet, we have all become connected, and our cultures have become less localized. People around the world watch the same movies, and "Interstellar" is as much a hit in the US as it is in India. The same reels go viral across borders and make it to everyone's feeds, making information ingestion more alike across people. But this time it feels more personal: the sameness comes from the generator itself, not merely from an influencer. Will our creative side now start losing its uniqueness too?

Most frontier LLMs are trained on internet data, with every lab chasing the rather similar goal of enabling task completion. There is some variety in model behaviour, with certain models being better at certain tasks or warmer to chat with, yet fitting every model to the same training distribution kills much of the entropy across them. Training on synthetic data generated by other models is also just another form of sampling from the same global pool of data, exacerbating the skew in the training distribution.

As the use of such models rises steeply, I wonder if we are converging on a future where the richness of the human distribution, with its long tail of minority outliers, is lost, and the models' diversity is all that's left. Already, the pervasive use of coding models has left people less connected to their code, and less likely to hit the eureka moments that spring up when one is stuck for a while with an existing piece of code.

Recently, Karpathy gave an example: while writing the nanochat code, the models kept wanting to use the Torch DDP module, a pattern from the training set they had seen, and a viable solution. Writing better code from scratch to improve upon that solution isn't something the models were incentivised to do. That drive, which we might once have seen from humans, is heading toward extinction as these models are used more for creative tasks. There is a lack of entropy in the distribution of generations, with no incentive for the models to come up with a unique solution to a solved problem, because everyone is focused on correctness. In the minority case of posing genuinely unique problems, the models that help solve them provide novel solutions by construction. The dominant use case, however, a rather generic framing with an under-specified description of the problem, is seeing convergence. Tying this to the previous piece I wrote: correctness might be ensured, but the quality of the solution is still a missing piece in evaluation, and hence in generation as well. In an interesting recent work, which I believe we need to think more about, the authors build better reward models in a multi-agent RL setup, focusing on proactivity and personalization, a crucial optimization for agents in user-centered interactions.

The depth of code we generally see from a distinguished engineer versus a less experienced one is distinct, and that distinction might vanish as everyone starts using the same handful of models for generation. For example, the SWE-fficiency paper recently showed that models still aren't great at coming up with optimizations for code within repositories, the way an expert would. These gains are part of what differentiates code written by different humans. Taking code style into perspective: I like writing my import statements directly, as I expect anyone to install from the requirements/pyproject file, yet these models keep enclosing imports in a try-except block, deviating from my style. If this becomes the norm, people (developers, for example) might achieve their goals while losing their own distribution and overfitting on the model distribution (the new bias/variance tradeoff?). Will the distribution of human-written code narrow with every passing moment we rely on these models?
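To make the stylistic difference concrete, here is a minimal sketch of the two import styles, using the stdlib `csv` module as a stand-in for a third-party dependency:

```python
# Direct style: trust that the environment was installed from
# requirements/pyproject, and fail loudly if it wasn't.
import csv

# Defensive style the models tend to emit: wrap the import and fall
# back to None, deferring the failure to the first point of use.
try:
    import csv as csv_maybe
except ImportError:
    csv_maybe = None

print(csv_maybe is csv)  # True here, since csv is always available
```

The direct style surfaces a missing dependency immediately; the defensive style trades that clarity for a silent fallback that only breaks later, at the first point of use.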

This problem might be graver in tasks where correctness itself is ill-defined. In domains that demand more creativity, like art, the diversity of a model's generations may be even more caged. A comedian cracks a joke (which matches the definition of a joke), but it lands differently with each listener. Here too, Karpathy rightly pointed out that if you ask a model for a joke, it repeats the same two or three variants. It does not grasp the vast abundance of the actual human distribution, or how a joke varies across cultures. It does not get the satire I would laugh at while chatting with my friends, nor would it generate those quirky, context-born remarks used to pull someone's leg if it were part of our group conversations.

Adding to my observation, someone on X interestingly pointed out that ChatGPT has spoiled the fun of reading emails, since a person's writing style used to come through in what they wrote. With these models in the loop, every email carries a similar tone, washed with perfection. Gone are the days when you got a sneak peek into the writer's state of mind from their one-liners, or from how confident, polite, or hesitant they sounded.

The human race has been passing down its unique cultures, be it food recipes, languages, or regional beliefs. Are we hitting a dead end here, with the model suggesting only the same five popular dishes when I ask what to cook for tonight's dinner? With more people relying on models to choose what to cook and which pick-up line to use, we are slowly, unknowingly, watching the rich human distribution collapse.

I feel that in this tug-of-war between completing one's task and adding one's own style or touch (which can have benefits beyond correctness), it's important for models to exercise more agency and produce higher-entropy solutions. Humans, too, need to keep reconnecting with themselves while using these models, to avoid getting lost in the usage. Should all our creative tasks be handed over to generic task-achieving models?

Thanks to Saujas Vaduguru for reading earlier drafts of this post!