Get Involved

We welcome contributors of all backgrounds and experience levels! Joining EleutherAI is as simple as joining us on Discord and picking a project to contribute to. However, there are some areas where we are particularly interested in recruiting experienced collaborators:

  • GPT-Neo is looking for people with expertise in ML pipelining and developing ML apps with user-friendly UIs.

  • The Pile is looking for people with access to high-quality data or the ability to scrape, clean, and process large-scale data sources.

  • The Radiation Lab is looking for people who are experienced with fine-tuning and with the loss landscapes of generative models.

As an independent organization, we depend on donations to cover our computing costs. We are always interested in speaking with people who can donate compute time.

Stuff We Need Help With

People keep asking how they can help, so this document is intended to help people figure out what they can contribute to. The #1 thing that we need is people who are competent at implementing deep learning systems. While there is some work for people without that skill set, we are a DL research lab and therefore most of the work involves writing and using neural networks. What follows is organized by skill set rather than by project.

If you have any questions, the #research channel on Discord is the best place to learn about our projects. Discord users Sid, BMK, Stella Biderman, and Connor (Daj) are generally involved with organizing things and can point you towards specific resources or answer questions.

Data Processing

Data processing work involves searching the internet for data, scraping and cleaning data, and writing scripts to load datasets. Our current data processing need involves building an evaluation harness. If you’re interested in helping out, check out the open Issues in the repo or stop by #lm_thunderdome on Discord. Other projects may have individual data processing needs.

Software Development

Software development work involves creating support systems, building software systems, and cleaning up code and documentation.

Deep Learning

Our biggest need is for software engineers who are skilled with deep learning, whether they are looking to contribute to open source software or to build their expertise on new projects.

The main repo we are developing right now is our open-source GPT-3 model, GPT-NeoX. We try to keep the issues in the repo as up to date as possible, so if you can familiarize yourself with the codebase and see an issue you think you can take on, go for it! Also feel free to ping @Sid on Discord with any questions.

Other DL projects we’re working on include scaling experiments (#scaling-laws), VD-VAE scaling (#vd-vae), replicating AlphaFold2 (#alphafold), reinforcement learning (#reinforcement-learning), AI interpretability (#interpretability), and tracing data through training (#the-rad-lab). Please ask in the relevant Discord channels if you think you can help out with any of the above.

Unfortunately we do not have many projects that are accessible to people who are beginners at deep learning. We welcome you to hang out in our Discord and learn, and may have jobs you can help with from time to time, but we have many more beginners than beginner-friendly tasks.

UI/UX

We have a sporadic need for UI/UX work. We are currently training GPT-3-style models for public release and would be very interested in assistance from anyone who could develop a web application or UI for them. Post in #website on Discord if you can help.

Web Development and Graphic Design

We sporadically need help with various tasks related to managing our internet presence. Right now our website has the basic functionality we need, but there’s always room for improvement. Post in #website on Discord if you can help.

Science

Biology

We are working to implement and publish AlphaFold2. Reach out to Lucidrains on Discord or post in the #alphafold channel if you can help.

Linguistics

We do a lot of NLP research, and are always interested in more linguists contributing. Currently our primary interest is in how scaling laws generalize across languages, but we are also interested in pitches from linguists about topics they are interested in. Post in the #scaling-laws channel if you can help.

Mathematics

We need theoretical and applied mathematicians to assist with various tasks, predominantly but not exclusively implementing group-equivariant neural networks and doing topological data analysis. We are interested in expanding the more mathematically sophisticated research we do, but we are strongly limited by the number of mathematicians we have. Reach out to Stella Biderman on Discord if you can help.