.arunim.fyi

paper

Jan 03, 2025 · 1 min read

Papers I’ve at least partially read, or planned to read, and taken notes or commented on.

20 items with this tag.

Dec 17, 2025
mask
- ai/eval
Jul 20, 2025
Patient-Specific In Vivo Gene Editing to Treat a Rare Genetic Disease
- bio
Jun 15, 2025
Utility Engineering
- ai/alignment
Apr 26, 2025
Existential risk narratives about AI do not distract from its immediate harms
- ai
Apr 26, 2025
Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark
- ai
Apr 23, 2025
IDs for AI Systems
- ai
Mar 26, 2025
Eliciting Language Model Behaviors with Investigator Agents
- ai/adv
Mar 26, 2025
Red Teaming Language Models with Language Models
- ai/adv
Mar 26, 2025
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
- ai/eval
Mar 26, 2025
Trading Inference-Time Compute for Adversarial Robustness
- ai/adv
Mar 26, 2025
gbrt
- ai/adv
Mar 26, 2025
gcg
- ai/adv
Mar 26, 2025
planetarium benchmark
- ai/eval
Mar 26, 2025
political persuasion and LLMs
- ai
Mar 26, 2025
tinyBenchmarks: evaluating LLMs with fewer examples
- ai/eval
Mar 19, 2025
Idiosyncrasies in Large Language Models
- ai
Mar 13, 2025
Transformers need glasses! Information over-squashing in language tasks
- ai
Jan 09, 2025
superhuman performance of a large language model on the reasoning tasks of a physician
- ai
Jan 08, 2025
large language model influence on diagnostic reasoning
- ai
Jan 03, 2025
membership inference attacks and training data proofs
- ai
