The Mathematics Search Engine
Mathematics News & Resources
4Mathematics is a specialist search engine for Mathematics. Discover the latest math news and mathematical content. Part of the 4SEARCH network of topic-specific search engines.
Latest News & Web Pages
Bootstrap confidence intervals for your LLM eval metrics
7+ min ago (601+ words) TL;DR: A single eval number hides its own uncertainty. Eval confidence intervals from bootstrap resampling turn a point estimate like 84.2% accuracy into a range, so you stop shipping models on a difference that is noise. I lead the fine-tuning…...
Unit of Work: Managing Database Transactions Like a Pro with Python
25+ min ago (1644+ words) Introduction Every serious backend developer eventually faces the same problem: you need to make multiple changes to a database as part of a single business operation, and you need all of them to succeed or none of them to go…...
Python List Comprehensions: Read Them in 3 Steps Without Getting Lost
1+ hour, 11+ min ago (252+ words) List comprehensions confuse beginners because they read backwards from how most people think about loops. Here is a three-step method that works for every list comprehension you will ever see, including the nested ones that make experienced developers slow down....
Channels-last memory format cut our conv backbone latency 22%
1+ hour, 8+ min ago (252+ words) TL;DR: Switching our convolutional segmentation backbone to PyTorch's channels-last memory format cut inference latency by about 22% on A100s, with no accuracy change and a four-line code edit. The channels-last memory format stores a 4D activation tensor in NHWC byte…...
The Local AI Assistant Trap: Why Running Your Own Costs More Than You Think
1+ hour, 26+ min ago (800+ words) The notification hit my phone at 2:47am. A dependency version conflict had bricked the local LLM setup I'd spent two weeks configuring. The model wouldn't load, the context window kept crashing, and my "personal AI assistant" was now a very expensive…...
Building Naija Shield: Behavioral Fraud Detection on Nigerian Mobile Money Rails
1+ hour, 53+ min ago (588+ words) Naija Shield is the behavioral fraud detection layer we built at BAMG Studio to address this gap. This article walks through the architecture decisions, the dataset problem, and the results from our pilot deployment. Rule-based engines: velocity checks, amount thresholds,…...
Toy transformers may represent belief-state geometry optimally but not minimally - LessWrong
2+ hour, 7+ min ago (26+ words) > Methods note: The code used for the experiments and related open-source repo were built with Claude. The experimental design and writeup is my own,...
Reasoning and learning about injected concepts in language models - LessWrong
2+ hour, 7+ min ago (1105+ words) This work was done as a part of SPAR, under the mentorship of Mirko Bronzi and Damiano Fornasiere. TL;DR...
Machine learning in production: the model is the easy part
2+ hour, 34+ min ago (517+ words) A model that scores 95% on your test set feels like the finish line. Then you ship it, and you find out it was the starting line. The model was maybe 10% of the work; everything that makes it survive production is…...
How I Stopped Overpaying For AI Models (And You Can Too)
2+ hour, 43+ min ago (1011+ words) Check this out: How I Stopped Overpaying For AI Models (And You Can Too) This is the post I wish someone had written for me six months ago. It's basically my whole journey of comparing every open source AI model…...