fastfedora

Trevor Lohrbeer fastfedora

AI safety researcher/engineer. Previously worked as an entrepreneur/full-stack engineer building commercial products across a range of languages & paradigms.

19 followers · 2 following

Pinned

dataset_foundry Public

A toolkit for building validated datasets. Uses data pipelines to load, generate, and validate datasets, especially those used in AI safety evaluations.

Python 1
single_file_backdoors Public

Evaluates how AI models might inject backdoors when refactoring single files and how to detect and defend against such insertions.

Python 1 1
full-repo-refactor Public

A Control Arena setting for evaluating agents inserting backdoors while refactoring full repos.

Python
full_repo_datasets Public

Contains datasets of full repos for doing AI safety research, along with Dataset Foundry pipelines for generating repos and datasets.

Python
llm-action-evals Public

A framework that allows non-programmers to build AI safety evals of LLMs taking actions in the real world via function-calling.

Python
ai-digest-demo Public

Demo of targeted AI persuasion that builds a profile of a user from their Facebook Likes. Developed as an 8-hour take-home test for AI Digest.

TypeScript
