
7/8

I’m currently extremely clocked out on the last Friday afternoon before break. I still have an exam and a few research meetings/presentations next week, but I’m quite brainfogged right now, so I decided it was a good moment to vomit out what’s been on my mind recently. Consider this post a compilation of “scattered thoughts that I wanted to write more about but never got around to” disguised as an end-of-semester reflection.

Research

This semester, most of my time was spent working on two research projects: one in empirical interpretability of language models and one in pure applied math. They have completely different characters and demand different kinds of attention, and I’m not too sure which one I enjoy more. However, I’ve noticed one theme that keeps coming up in both projects and that also came up in my internship this summer: communication clarity.

In research, where progress seems to be extremely nonlinear and not monotonic, being able to clearly communicate one’s progress and ongoing ideas seems to be of utmost importance. Skills like distilling a lack of progress down into key sticking points and developing a knack for deciding which details to share and which to temporarily ignore are invaluable. From experience, knowing when to keep a half-baked thought unsaid versus sharing it can be the difference between a big insight and the discussion devolving into a mess.

Visualizations

"A picture says a thousand words" - probably an experienced researcher
I never truly understood this saying until I tried to describe, from memory, a figure I forgot to save and accidentally overwrote in my Jupyter notebook.

My applied math advisor, Govind Menon, is a dynamicist. He often tells me that dynamicists live and die (overdramatized for effect) by their pictures. He pushes me to always think about the visual intuition behind important definitions and theorems. "I don't need to remember theorem statements," he says. "I just remember the picture."

One big component, maybe the most important component, of clear communication and presentation seems to be visualizations. Figure 1s, as they’re called. A consistent piece of feedback I’ve received after presenting work (at standups at my summer internship, lab meetings, etc.) is that I should include more diagrams and figures and rely on them more heavily.

I think the ML community does a good job with visualizations and figure 1s. Here’s an arbitrary one from a recent cool paper I read that introduced a new training framework for transformers to “do” chain of thought in their internal hidden space:

[Figure: nice_figure]

The figure is simple and not flashy but clearly outlines the key difference between their training method and standard chain of thought. Good pretty picture.

Maybe beneath this section lies a hidden argument for why multimodal learning should be the new data efficiency paradigm. Maybe there also lies an argument for why more UIs like Neuronpedia are needed for interpretability research. Regardless, good visualizations are one of my main improvement goals for my ongoing projects.

Other research notes

I might be a beta, but lowkey sometimes being told exactly what to do and it just working is nice - being a code monkey is sometimes nice. I wonder if that’s what research will be like in the future when AIs develop taste and intuition.

Unfortunately, it is also true that the thrill of discovery, good ideas, and understanding all seem to be necessarily preceded by painful exploration and uncertainty.

Rant-y Comment on the Humanities

One goal I had this semester was to try attending more humanities-oriented talks in search of new perspectives, especially on societal issues surrounding AI and philosophy. Unfortunately, I’ve left somewhat perplexed and not sure what to think.

Simply put, everything feels big. Big words, big ideas, big talk. “Discussions” play like a contest of ego shrouded in obscurity - whoever can abuse the most terminology or quote the most famous authors to sound the smartest and minimize the log likelihood of their sentences wins. From an outsider’s perspective, conversations sometimes feel like watching a live-action version of “The Emperor’s New Clothes”: despite the flourish-y vocabulary, I somehow get the impression that very little content or understanding is being communicated and shared.

I promise I’m not just dunking on the humanities in bad faith. I really hope I’m very wrong and simply approaching this from a background that is too inexperienced and naive. I admit I’m probably not the target audience for the talks. But it’s been quite frustrating trying to find talks that I am the target audience for - the few talks I do understand feel too “obvious” and the ones I can’t understand feel insurmountably gatekept atop a pile of insubstantial concepts like “negentropy” and “symbolic systems.” It also appears that each person has their own interpretation of the concepts (which is possible because of their hazy meanings). As time passes, each concept converges to a state of superposition between all the various interpretations, so that the term’s meaning is derived strictly from essential context that I lack and that, I claim, is unfairly assumed to be common knowledge.

Enough ranting - this is probably just skill diff, and I’ve still had positive experiences. For example, by attending a few policy talks and reading literature, I’ve definitely gained a better picture of the international and domestic dynamics between the big players in the AI space, as well as of existing regulation. I wouldn’t say they’ve been the most insightful or exciting talks that I’ve ever attended (nor do I expect them to be), but I still leave better informed and with solid takeaways.

Life stuff

As graduation inches closer, I’ve noticed that my circle has naturally shrunk. Walking around campus, I still recognize plenty of people. But outside of maybe ~4 others, I don’t really know anyone well. I would have trouble believing that a “strong” version of Dunbar’s number would be over 10.

I’ve wondered whether the shrinkage is a result of me trying less to meet new people and not putting enough effort into my relationships. The progression over the years has felt parallel to me slowing down in swim and sucking more at games - it’s a slow descent that stems from shifting priorities and, truthfully, a lack of effort.

Oddly, I don’t wish for more close relationships. I don’t even think it would be possible for me to maintain many more. I’m very ambivalent, but I wonder if I shouldn’t be.

Social Media

I picked up using Twitter/X this semester. Right now I’m just using a burner account to stalk and follow a ton of ML researchers, originally with the intention of keeping up with papers, trends, and news from big labs and conferences. It’s worked amazingly well at that plus more - I’ve also noticed that lots of researchers and professors at top labs and universities even recruit researchers/PhD students on the platform.

Unrelatedly, I’ve become more bullish on the effectiveness of dating apps if used correctly. I haven’t used them personally, but three people close to me made Hinge accounts this semester, and all of them found long-term relationships within their first few ($<5$ I believe) matches. Yes, $n=3$, but that doesn’t stop me from extrapolating :P.

Cooking

I finally got off meal plan this semester and have enjoyed cooking for myself and my friends. One hypothesis for why I find it so fun is that I see a lot of parallels between cooking and research/problem solving. Here are just a few off the top of my head:

  1. You have access to limited and constrained compute/resources/past techniques (what’s in the fridge) and have the option of following existing literature (recipes) or paving a new path (the “f it we ball” approach).
  2. Both definitely involve a strong notion of “intuition” that is developed through experience.
  3. Innovations could come from new insight (adding more ingredients) or removing redundancy (removing ingredients). New work could also come from coming up with more efficient ways to do an existing procedure.
  4. Approximations (substitutes/workarounds) are bountiful and quintessential.
  5. Bad research and bad cooking are quite similar. More often than not, both seem to come from unproductively increasing the complexity of the system. Good work is simple work with clean results, just like how good cooking is simple cooking with clean flavors.

This connection/parallel deserves a whole post that I will force myself to write someday.