Translations of this post are available in simplified Chinese and Korean.
We are excited to announce that Wolfram|Alpha is now available in simplified Chinese and Korean! This adds to our growing list of languages, including Japanese, Spanish and English, allowing us to continue to support our long-term goal of making all systematic knowledge immediately computable and accessible to everyone.
What Is Wolfram|Alpha?
Wolfram|Alpha is a computational knowledge engine that has been providing students and professionals with prompt engineering solutions for over 15 years, built upon the 35 years of research and development of Wolfram Language. Similar to the now popular LLM chatbots, Wolfram|Alpha processes natural language queries ranging from basic arithmetic to advanced calculus.
Wolfram|Alpha is based on four key components:
Natural language understanding: The search engine is designed to interpret and understand the questions asked by users in intuitive ways, ensuring no one is barred from the knowledge they seek.
Curated data and knowledge: Our information is built on more than 10 trillion pieces of data from primary sources and is continuously updated by experts.
Dynamic algorithmic computation: Once a question is interpreted, Wolfram|Alpha pulls the relevant information from 50,000+ types of algorithms and equations to summarize and generate results that are accurate and helpful.
Computed visual presentation: Wolfram|Alpha not only returns the exact answer to the user, but also provides additional information, including more than 5,000 different types of visual and tabular outputs for better understanding.
Wolfram|Alpha in simplified Chinese and Korean is much more than simply a translation of the English content. These versions are an adaptation of the natural rule and result generation to simplified Chinese and Korean.
What Can You Do with Wolfram|Alpha in Simplified Chinese and Korean?
The simplified Chinese and Korean updates include each math topic that is available in the English version. From elementary math to calculus and everything in between, the wide diversity of math topics included in Wolfram|Alpha allows it to answer almost any question you might have.
We can start to explore Wolfram|Alpha by clicking the random button, visiting the large examples gallery that is organized by topic or by making a query in the search bar.
Let’s start with something simple. Need to figure out what the factors of 50 are? You’ll be provided with a tidy breakdown and the option to see step-by-step explanations with Wolfram|Alpha Pro.
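For readers curious about what a "factors of 50" breakdown involves, here is a minimal Python sketch. It is only an illustration of the underlying arithmetic, not a description of how Wolfram|Alpha computes its results, and the function names `divisors` and `prime_factorization` are hypothetical helpers chosen for this example.

```python
def divisors(n: int) -> list[int]:
    """Return all positive divisors of n in ascending order (trial division)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)            # d divides n
            if d != n // d:
                large.append(n // d)   # the paired divisor n / d
        d += 1
    return small + large[::-1]


def prime_factorization(n: int) -> dict[int, int]:
    """Return a mapping {prime: exponent} for n >= 2 (trial division)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                          # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


print(divisors(50))             # [1, 2, 5, 10, 25, 50]
print(prime_factorization(50))  # {2: 1, 5: 2}
```

In other words, 50 = 2 × 5², which gives the six divisors listed above.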
This post discusses the Wolfram High School Summer Research Program. You can read about various Wolfram education programs and the Wolfram Middle School Summer Camp to learn more about these programs specifically.
As the Precollege Educational Programs Manager at Wolfram, I have the privilege of working with hundreds of bright, passionate students from middle school to college. In this post, we’ll be exploring the Wolfram High School Summer Research Program, Wolfram’s flagship program for younger students. I’ve been the Program Director since 2019, and it’s been a joy to expand and extend the Wolfram education ecosystem over that time. We have worked hard to create enrichment programs for talented students, and we now have a rich ecosystem of opportunities for students of all ages.
What Is the Wolfram High School Summer Research Program?
The Wolfram High School Summer Research Program is a project-based research opportunity for motivated high-school students. In just two and a half weeks, students are guided by expert mentors to develop a project using Wolfram Language at the intersection of modern computing and an academic interest of their choice. Students then publish a computational essay to showcase their work.
The Wolfram High School Summer Research Program is for students who love to dig deep into complex problems and think outside the box. It’s perfect for those who are drawn to learning for the sake of discovery—whether that’s through coding, research or creative problem solving. The program is especially suited for bright students who might not exactly fit in with the typical school environment but find their passion in academic exploration. No matter what STEM or STEM-adjacent subject they get excited about, they’ll find others who share their passion.
We accept between 65 and 75 students aged 14 to 17 to the program, with the occasional 13-year-old. Students come from around the world and from many different economic backgrounds. What they all have in common is the motivation to achieve difficult goals and passion for a subject (or multiple subjects).
The Wolfram High School Summer Research Program is held on the beautiful campus of Bentley University, outside of Boston, Massachusetts. Students get to experience university life, often for the first time. Students tend to enjoy the opportunity for some independence and freedom in the safety of a university campus.
Academic Experience
The Wolfram High School Summer Research Program is a primarily academic program, and our goal is to help students find crossovers between their interests, explore brand-new topic areas and complete a research project in a subject that matters to them. Going from a novice Wolfram Language coder to producing high-quality research is no trivial matter, and it doesn’t happen without hard work from students and staff.
Admission to the Wolfram High School Summer Research Program
Admission to the program is competitive and requires students to answer several short essay questions, complete a coding problem set and attend an interview. Students are not just assessed on their academic record, but also their passion and critical thinking skills. A successful Wolfram High School Summer Research Program student will be self-motivated and able to parse new information to find creative solutions to unique problems. We’re not just looking for the brightest students, but for students who are passionate about their field and who will take the best advantage of the program.
For students who may not be ready but are eager to strengthen their skills for admission the following summer, we recommend checking out Computational Adventures, a set of project-based learning experiences designed by the Wolfram High School Summer Research Program directors.
Pre-program Workshop
Many students don’t have access to high-quality computer science education, even if they’re bright and motivated. To make sure that more fantastic students have access to the Wolfram High School Summer Research Program, we offer a free pre-program workshop to students who haven’t coded before. At the workshop, we cover introductory material in computer science and Wolfram Language, letting students begin the summer on a level playing field with the others.
Learning Wolfram Language
A big part of the Wolfram High School Summer Research Program is learning to code in Wolfram Language. Wolfram Language is a symbolic language, deliberately designed with the breadth and unity needed to develop powerful programs quickly. Coding in Wolfram Language allows students to skip out on the usual learning curves that people often get stuck on and move straight to being able to do powerful things with their code.
Creative Computation
Before the summer begins, students take the interactive Wolfram U course Creative Computation. Designed by the Wolfram High School Summer Research Program directors, Creative Computation is a fast introduction to Wolfram Language and computational thinking more generally. In the project-based course, students create computational art and poetry, explore audio and build video games to gain a solid foundation in coding. Students who come to the pre-program workshop do a small subsection of the course.
Experiential Learning
When students arrive on campus, we spend the first few days extending our understanding and abilities in Wolfram Language, computer science and computational thinking. In small groups, assisted by a mentor, students work through a set of carefully curated mini-projects on a wide range of topics. Students deepen their understanding of coding concepts by solving problems like creating a facial recognition application, analyzing epidemic data, finding programmatic solutions to word problems, debugging broken code, creating computational artwork, building dynamic weather forecast applications, exploring cellular automata, manipulating neural networks, analyzing chemical compounds and planning optimal road trips.
By working through these problems with peers and mentors, students both explore new topics and get better at coding in preparation for their project.
The Project
Once students have a good grasp of Wolfram Language, we move on to the main part of the program: doing a project. Students propose potential projects as part of their application, and we have a large bank of potential projects for students to do in all subject areas. Each student meets with their mentor, the directors and Stephen Wolfram to decide on a project that will suit them best. At the end of the program, students publish their research project on Wolfram Community to a wide audience of potential reviewers. Successful students are encouraged to submit their project for assessment for Wolfram U’s Applied Expertise in Computational Research certification and may turn their project into a paper for submission to a journal.
Interdisciplinary Work
All projects at the Wolfram High School Summer Research Program are interdisciplinary, meaning that students work on the crossover of at least two of their interests. For one of our students, that meant combining his passion for sailing with his academic interest in physics to create aerodynamic simulations to find optimized paths for sailing. For another, that meant combining a deep love of chemistry with an interest in data analytics to analyze molecular cages. We take great care to combine students’ interests in new and interesting ways, which often means students create amazing projects with unusual subject combinations.
This wide scope of projects is unique to the Wolfram High School Summer Research Program. Where most programs focus on a particular branch of STEM, the Wolfram High School Summer Research Program encompasses all STEM and STEM-adjacent subjects and creates an environment where students can explore pretty much any topic, even if it’s drone ballets, tongue-twisting poems or ancient Greek syntax. To see the rest of our students’ unique projects, take a look at the project gallery.
Examples of Projects
To understand the breadth and depth of projects at the Wolfram High School Summer Research Program, it’s worth diving into a few projects in different areas, done by students at different stages of their educational journey.
Example 1: Minimal Universal Classical and Quantum Gates
At the Wolfram High School Summer Research Program in 2024, Christopher Gilbert explored minimal universal classical and quantum gates for his project. By formulating classical operations in terms of linear algebra, he implemented a quantum computing simulator that can run both classical and quantum gates. Christopher used his simulation to find a novel minimal set of 2 2 quantum gates that may be the smallest set ever found.
Christopher was able to explore the universality problem in detail, engaging with graduate-level mathematics and computer science topics.
Example 2: Simulating the Flocking Behavior of Boids within a Parametrically Defined Vector Field
Engaging with his interest in robotics and mathematics, Ritvik Gupta’s 2024 project explored the boid simulation, a widely used model for simulating emergent flocking behavior from simple rules. Ritvik implemented algorithms to manipulate the boid simulation with vector fields, maintaining flocking behaviors across parametric paths while avoiding obstacles. Ritvik’s project has applications to computer graphics and swarm robotics, and he was able to touch on his interests in physics, biology and computer science.
Example 3: Building Nanobelts in Silico
Ananya Thota’s project combined her interests in chemistry and biomedical research to model carbon nanobelts and explore potential scientific applications. Ananya computationally generated nanomaterials and explored the potential of nanobelts to secure molecules. This work has applications in medicine development and medical procedures.
Example 4: Animating Drone Light Shows
Eleanor Chen, one of the youngest students in 2024, followed her passions for algorithm development and art to create an application to map pathways for drones to follow when transitioning between formations. Her approach focused on determining the best flight paths for drones by minimizing flight time and distance traveled as they move between sections of a performance. Her algorithm allows groups of drones to transition seamlessly between individuals and groups of images.
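To give a feel for the kind of problem involved, here is a small Python sketch of one simple way to match drones to target positions so that the total distance flown is as small as possible. This is an assumption-laden illustration and not Eleanor’s algorithm: it uses brute-force search over every possible matching, works in two dimensions, ignores collisions and flight time, and the name `assign_drones` is hypothetical.

```python
from itertools import permutations
from math import dist

Point = tuple[float, float]


def assign_drones(starts: list[Point], targets: list[Point]) -> tuple[tuple[int, ...], float]:
    """Return (assignment, total_distance), where assignment[i] is the index
    of the target position given to drone i.

    Brute force: tries all n! matchings, so it is only practical for a
    handful of drones.
    """
    if len(starts) != len(targets):
        raise ValueError("need exactly one target per drone")
    best_order: tuple[int, ...] = ()
    best_total = float("inf")
    for order in permutations(range(len(targets))):
        # Total straight-line distance if drone i flies to targets[order[i]].
        total = sum(dist(starts[i], targets[j]) for i, j in enumerate(order))
        if total < best_total:
            best_order, best_total = order, total
    return best_order, best_total


# Three drones in a row move to a row one unit above them.
starts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
targets = [(2.0, 1.0), (0.0, 1.0), (1.0, 1.0)]
order, total = assign_drones(starts, targets)
print(order, total)  # (1, 2, 0) 3.0
```

Each drone is sent to the position directly above it, for a total of three units flown. For larger formations, a polynomial-time assignment method would be a more realistic choice than trying every matching.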
Mentors
Throughout the project development cycle, students are supported by expert mentors. The mentor team is made up of Wolfram employees, professors and industry experts, as well as undergraduate and graduate students. Undergraduate mentors are usually Wolfram High School Summer Research Program alumni and complete extensive training as part of the Wolfram Emerging Leaders Program.
Our mentors are experts in both Wolfram Language coding and their area of STEM or STEM-adjacent academic study. Each year, we have a slightly different set of academic areas covered by the mentor team, but subjects our mentors regularly excel at include computer science, mathematics, physics, life sciences, engineering, linguistics and economics.
Mentors have three or four students in their group and help each student individually to understand academic content, plan their projects, code, write their essays and learn soft skills like research and independent work. Mentors also help students collaborate between similar projects, teach students concepts they’re less familiar with and help keep students on track to finish a great project.
Academic Extension
Although doing a project is the core feature of the Wolfram High School Summer Research Program, there’s a lot more to the program! Students engage in a rich schedule of academic and non-academic activities ranging from quantum physics to karaoke nights.
Seminars and Talks
Throughout the program, students can elect to attend lectures and small-group seminars on a range of topics. Seminars are usually led by mentors and include a short teaching period to go over an advanced topic, followed by activities or discussions. Talks are usually given by guest speakers and are more focused on exploring a subject in a lecture format.
The seminar and talk schedule is different each year, catering to the admitted students’ interests and current trends in technology. Some amazing speakers we’ve had in the past include Sam Blake, who solved a Zodiac Killer cipher; Christian Pasquel, indie hacker and computational artist; Brian Silverman, founding developer of MIT’s Scratch; and Maggie Wear, fungi researcher. Seminars have ranged from creating mathematical art, to exploring the backend of Wolfram|Alpha, to engaging with computational chemistry, to analyzing the linguistic differences between foreign languages.
Engagement with Stephen Wolfram
Stephen Wolfram, founder and CEO of Wolfram Research, is highly engaged with the summer programs. From the start, Wolfram and the directors work with each student to find a project that’s right for them, and many projects are specifically crafted by Wolfram to suit a student’s interests and abilities.
Credit to hidoba.com.
One of the first major events at the Wolfram High School Summer Research Program is a question and answer session with Wolfram, where topics range from entrepreneurship to physics to history to education.
Throughout the program, students have the opportunity to meet with Wolfram in small groups focused on particular topics. These small groups take a walk around campus to deep dive into something that interests them, get meaningful advice and share their interests. This is an unparalleled opportunity for engagement with a researcher and CEO.
It’s not all business, though, and students also have the chance to hang out with Wolfram on several evenings throughout the program. He usually has some cool technology to share, and students might get to experiment hands on with VR headsets, robots, drones, vintage math equipment or interesting software.
Recreational Experiences
While the Wolfram High School Summer Research Program is an intense academic program, we still have a lot of fun! Whether we’re learning to tango, singing a cappella or building roller coasters for marbles, there’s a wealth of non-academic experiences to round out the summer.
Minors
Minors are an opportunity for students to deep dive into a creative or physical activity. For several evenings throughout the program, students get to stretch their brains in non-academic directions by choosing to join one of the minor tracks. Each year, we design minors around admitted students’ interests, giving options such as creative writing, nature walking, sports, music, dance, art, speech and debate, photography, and video game design.
Minors give students the chance to get to know a different group of people, learn a new skill and relax from the rigor of doing their projects.
MIT Field Trip
No summer program would be complete without a field trip—and the Wolfram High School Summer Research Program students have one of the best. After a quick ride into Boston, students take the time to explore the MIT Museum, where they find hundreds of hands-on, interactive exhibits focusing on problem solving with science and technology.
Once students have explored the museum, the MIT graduates on the staff team take interested students on a personal tour of the MIT campus, where they learn some secrets, take in the tourist destinations and learn about what life at the university is like.
Students have some free time to explore Cambridge and find some great food before we head back to Bentley for cozy activities and snacks.
Later in the program, students have a choice of mini-trips, either dinner or a hike at the Storer Conservation, in the city of Waltham, which is a great chance to take a break before the project deadlines.
Social Activities
Throughout the program, we have a schedule of social activities led by the teaching assistant team. From casual board games and spirited Model United Nations debates to soccer games and German crash courses, students have plenty of options to fill their evenings, make friends and share their hobbies.
The biggest social activities at the Wolfram High School Summer Research Program include a mini-hackathon, where students have just two hours to build a product based on a theme; a capture the flag coding competition, where students solve Wolfram Language puzzles to unlock a secret code; and a trivia night on popular culture, history, science and music.
Great Friends
One of the best things about bringing an eclectic group of bright students together is how quickly a community forms. Students who may struggle to develop deep friendships at school find themselves able to talk late into the night with people on their wavelength. The group as a whole is incredibly accepting and welcoming, and students get to socialize in a variety of environments from formal activities to free time.
In past years, students have started their own book clubs, lecture circles to share their interests, video game tournaments, pick-up ball games, walks around campus and music nights, creating a friendly and lively atmosphere for everyone.
Many students stay in touch for years after the program, cementing truly lifelong friendships with research partners, startup cofounders and energetic debate opponents.
After Summer
College Destinations
Graduates from Wolfram programs have gone on to attend top universities from around the world, and usually manage to find each other there! With groups of alumni meeting regularly at MIT, UC Berkeley, Stanford University, the Georgia Institute of Technology and the University of Chicago, there are ready-made communities for talented students to join.
We are always happy to write letters of recommendation for successful alumni, helping students gain access to their top choices of schools.
There are also plenty of opportunities to stay involved with Wolfram programs, getting steadily more advanced. You can find more information about our advanced programs in our post about the Wolfram education ecosystem.
Ready to Attend the Wolfram High School Summer Research Program?
There are so many reasons why this program can be one of the best experiences of a student’s high-school career. The students who thrive at the Wolfram High School Summer Research Program are the ones who love exploring big ideas and working on projects that blend unexpected disciplines. The program offers a chance to dive into advanced, interdisciplinary research alongside expert mentors who genuinely want to see students succeed. It’s an environment where curiosity and passion are celebrated and where students often find others who share their excitement for tackling complex problems and creating something unique. Whether students are exploring big scientific questions, pushing the boundaries of computational creativity or simply finding a community of people who “get” them, Wolfram’s education programs offer a space where their talents and perspectives will be welcomed and encouraged for years to come.
Applications for the Wolfram High School Summer Research Program and the Wolfram Middle School Summer Camp open in November each year, and we hope to see you, your children or your students there!
Find more information about other educational opportunities:
Wolfram Emerging Leaders Program »
Wolfram U: Creative Computation »
Wolfram Student Ambassador Initiative »
Computational Adventures »
Watch the full “AMA: Wolfram’s Summer Education Programs” to hear Eryn and Rory answer questions about the Wolfram High School Summer Research Program and the Wolfram Middle School Summer Camp.
This summer marks the fifth annual Wolfram Middle School Summer Camp. Students at the camp learn the basics of Wolfram Language and make connections with other young STEM enthusiasts from around the world. Our goal with this fully virtual camp is to offer an on-ramp into other Wolfram programs for girls and gender non-conforming students with diverse academic backgrounds.
I’ve had the joy of running the camp alongside Program Director Rory Foulger since 2023 and am eager to walk through everything that goes into creating this experience.
Who’s Who
The Wolfram Middle School Summer Camp, along with other programs in the Wolfram education ecosystem, utilizes a “near-peer” framework for staffing. The primary staff for the camp are the mentors; these are high-school or college students who have been successful in Wolfram programs in previous years. Most often, mentors are alumni of the Wolfram High School Summer Research Program and the Wolfram Emerging Leaders Program (WELP). Two of our mentors for 2025 not only completed the Wolfram High School Summer Research Program but also started their journeys with Wolfram as students at the Wolfram Middle School Summer Camp!
Near-peer instruction allows our mentors, who share many qualities with our students, to deeply engage campers in their teaching. They’re experts in Wolfram Language and have engaged in a year of teacher training as part of WELP, so they can offer both a close relationship and high-quality instruction.
Mentors at the camp have a hand in almost every part of it. The mentors lead class blocks, assist students during explorations and design and run social activities. In previous years, mentors, who act as role models, teachers and social leaders for the students, have always been ranked as the best part of the camp.
In addition to the mentors, the camp is run by the same director team as other Wolfram education programs. We build a scaffolded curriculum and work directly with students along every step of the experience, assisting with the application process, leading program activities and supporting post-camp experiences.
Running a Virtual Program
While we started the Wolfram Middle School Summer Camp as a virtual camp due to the pandemic, we quickly found that being fully virtual was a major positive for the students. The camp is live and hands on, allowing students from all over the world to attend. Parents don’t need to worry about travel or about their child participating in a residential program before they’re ready, and students can study from the comfort of their own homes. A majority of our students are from the United States, but we’ve had students from all over the world, including Canada, the United Arab Emirates, India and Germany, attend.
We design both academic and social activities not just to account for the challenges of virtual teaching but to take advantage of the format, for example by using breakout rooms or choosing activities that are made specifically for online participation.
Learning Wolfram Language
Classes
We spend around half of the camp in classes. Each class covers a new area of functionality or a collection of related functions, ranging from manipulating lists and defining functions to creating graphics and audio. In their classes, our mentors balance teaching with hands-on activities. For each topic, concept or function within a class, students typically spend 15 minutes learning new content and 15 minutes with a mentor working on practical activities and exercises in small breakout groups.
We aim for students to not just learn what a function is, but for them to get comfortable with how to use it in different ways. Rather than listening to lectures, students spend time experimenting and working together in groups to solve problems, while also gaining the confidence they need to write their own code.
Explorations
While they are hands on, the activities students complete during class blocks are relatively short. Students at the camp explore computation more deeply and build problem-solving skills with mini-projects we refer to as “explorations.” These explorations allow students to use the skills they’ve learned in class on more complex problems and learn about Wolfram Language outside of the core functionality covered in classes. The tasks in our explorations are complicated enough that students will gain experience in computational thinking and breaking down more complex programming problems, while still providing clear goals that students can reach within roughly an hour.
These explorations are expanded and updated with each new year of the camp, and new explorations are added based on new areas of functionality or student interests in the previous year. Some long-standing explorations include completing scavenger hunts for data within the Wolfram Knowledgebase, implementing image identifiers using machine learning and creating generative graphics and other forms of computational art.
Having a variety of these explorations not only allows us to expose students to a lot of different topics but also allows us flexibility in difficulty level. Explorations may have “extensions”—optional stretch goals for the students who want an extra challenge. Conversely, many of these explorations have smaller warm-up activities or a baseline goal that is simpler than the final goal for students who may need more time to grasp the necessary concepts. We consider not just the individual exploration, but all the explorations as a whole. We balance fun and complexity, with the goal that all students should have multiple explorations they would not only be able to complete but find fun or interesting.
Students get to pick which explorations they tackle during the week, which means if they are interested in a particular topic, want more practice with a certain function or just want to take a break from one exploration to try another, they can! We aim for every student to fully finish at least one exploration, which gives them the flexibility to try a lot of them during the camp while still having a finished project to take “home” with them.
Here are two computational art activities from the camp where students design repeating patterns to practice using Graphics and iterating functions like Table and create “wallpapers” while learning about representing images, pixels and colors in Wolfram Language.
Other Activities
Falling in between academic activities and social activities, we also show students a variety of experiences and perspectives on STEM through panels and Q&A sessions.
Panels
The camp holds two panels toward the end, and these serve to encourage our students to think about the possibilities they have for their futures. The first of these panels features Wolfram employees discussing their different paths in STEM and how they got where they are today. The second features the camp’s mentor team and is focused on their experiences in high school or college as well as their experiences with other Wolfram programs.
Stephen Wolfram at the Wolfram Middle School Summer Camp
In addition to Wolfram employees and the mentor team, students also get the opportunity to interact with Wolfram Research CEO Stephen Wolfram. Typically, the camp offers a Q&A session where students can ask questions about academics, science and life in general. Additionally, at the end of the program, students meet with Wolfram in smaller groups where they can ask further questions or discuss whatever interests them.
Social Activity
Despite being a virtual program, the camp has lots of social activities to encourage students to take breaks while working, foster friendships between students and just have fun!
Here are two sections from the Wolfram Middle School Summer Camp 2024’s “Camp Memories.” For the final social activity every year, students create a collaborative board of highlights and favorite moments from the camp.
Social activities are designed and facilitated by our mentors; each mentor is responsible for one or two activities, and the directors work with the mentor team to create a fun and balanced social activity schedule. This schedule balances what type of activity the students are doing (physical, academic, artistic) and interaction with their peers (individual activities, team competitions, group discussions). Favorite activities from previous years have been show-and-tell scavenger hunts, origami and drawing, and PowerPoint karaoke.
In addition to varying the type of activity, we also vary how computer dependent the activity is. While both students and staff enjoy online games like skribbl.io and GeoGuessr, we also mix in activities that are more relaxed and require students to take breaks from looking at their screens, like origami or yoga.
Beyond the Summer
The Wolfram Middle School Summer Camp is just the first step into Wolfram technology for young students. Students who attend the camp are in a fantastic position to attend the Wolfram High School Summer Research Program once they are 14. You can read more about the Wolfram High School Summer Research Program in the other blog posts from this series.
After completing higher-level Wolfram programs, students who have attended the Wolfram Middle School Summer Camp can also consider coming back to camp as mentors!
Watch the full “AMA: Wolfram’s Summer Education Programs” to hear Eryn and Rory answer questions about the Wolfram High School Summer Research Program and the Wolfram Middle School Summer Camp.
This post discusses various Wolfram academic programs. You can read about the Wolfram High School Summer Research Program and the Wolfram Middle School Summer Camp to learn more about these programs specifically.
Table of Contents:
Computational Adventures (for learners and hobbyists new to Wolfram and STEM) »
Creative Computation (for learners and hobbyists new to Wolfram and STEM) »
Summer Programs for Middle- and High-School Students (for middle-school students interested in STEM and high-school students familiar with STEM) »
Wolfram Emerging Leaders Program (for Wolfram High School Summer Research Program alumni) »
Becoming a Teaching Assistant »
Becoming a Mentor »
Stephen Wolfram’s Coding Adventures »
Wolfram Student Ambassador Initiative (for students who are avid Wolfram Language users) »
Wolfram Summer School (for post-high-school students and adult learners) »
What’s on Offer at Wolfram?
The precollege education team at Wolfram runs a wide range of programs and experiences for students from middle school and up. Many of our students start out with asynchronous online programs to get a feel for computational thinking and coding before moving on to our synchronous online programs or in-person programs.
Asynchronous Programs
The asynchronous programs at Wolfram allow students to learn at their own pace, without the pressure of a peer group or live instructor. Our team has developed two main programs for this purpose, Computational Adventures and Creative Computation.
Computational Adventures »
One of the best starting points for younger or less experienced coders is Computational Adventures, a set of self-guided mini-projects designed to teach Wolfram Language and computational thinking to students of any age. Each adventure includes instructional material and a mini-project, with hints and solutions provided. Adventures range from computational art to cryptography and code-breaking, giving students the opportunity to develop skills in a wide variety of areas.
Computational Adventures doesn’t require a teacher and can be used by a single student at home, by homeschoolers, or in clubs, groups, classes and after-school activities.
Creative Computation »
The Wolfram U course Creative Computation explores computational art, poetry, audio and video game development. While all Wolfram U courses are great for learning various aspects of Wolfram Language, coding and applications, Creative Computation is designed specifically for beginners to get a project-based head start with computational thinking and Wolfram Language coding. Creative Computation doesn’t require any background in math or computer science, and students will learn to code in a fun, hands-on way, creating a portfolio of projects to show off. Students can also earn a certificate of completion.
Anyone who has explored Wolfram Language can take the Proficiency in Wolfram Language exam to get a certificate, and students who’ve done significant computational research in Wolfram Language, including at the summer programs, can submit their work for the Applied Expertise in Computational Research certification program.
Synchronous Programs
Beyond online courses, our team offers several synchronous programs for various groups of students, including the Wolfram Middle School Summer Camp, Wolfram High School Summer Research Program and Wolfram Emerging Leaders Program.
Summer Programs
For young students, we offer the Wolfram Middle School Summer Camp, which is a week-long, fully remote program for girls and gender non-conforming students aged 11–14. Students explore coding in a fun and accessible way through mini-projects, hands-on exercises and activities.
Talented students aged 14–17 might be ready to join the Wolfram High School Summer Research Program, which is a 2.5-week, in-person program for motivated and bright high-school students passionate about STEM. Students publish a research-based paper on Wolfram Community showcasing their project work.
You can read about the Wolfram High School Summer Research Program and the Wolfram Middle School Summer Camp to learn more about these programs specifically. Adult students may enjoy the Wolfram Summer School.
After the Wolfram High School Summer Research Program
The Wolfram education ecosystem doesn’t end after summer! Our team often jokes that we can entertain a student for at least 10 years, and it’s not a total exaggeration. After a student’s first summer at the Wolfram High School Summer Research Program, there are a lot of options to continue their educational journey with Wolfram well into adulthood.
Wolfram Emerging Leaders Program (WELP) »
Around half of Wolfram High School Summer Research Program students are selected to join the Wolfram Emerging Leaders Program (WELP) in the school year after their summer experience. Between September and January, students work remotely in small groups to formulate, execute and write up a longer and more in-depth research paper. While students receive plenty of support from mentors, they are encouraged to be more independent than they were over the summer. They learn how to structure their time, be accountable to their teammates and work independently—all great skills to learn in preparation for college.
At WELP, students are organized into groups based on their interests and goals. Do they want to do serious research? Build a product? Design educational materials for younger students? Develop specialized functions for the Wolfram Function Repository? All of the above? Each group develops a project proposal for consideration by the director team, going through a similar process to proposing a thesis project. Once their project has been approved, students utilize software development principles to structure their project and their time, setting realistic goals and working together to create something amazing.
At the end of the program, students publish their computational essays as evidence of their hard work. Many groups submit new functions to the Wolfram Function Repository and have their code incorporated into the language for others to use.
Students often participate in WELP more than once, picking a different topic and working with a different group each time. Once students finish high school, they’re invited to join the Wolfram Emerging Leaders Program for Undergraduates, where they complete an extended independent research project with the guidance of an expert mentor.
WELP projects have included modeling the dynamics of the earth-moon-sun system through the three-body problem, post-quantum hashing using chaotic double pendulum dynamics, mathematical explorations of functional iteration and roots, generating plant structures with 3D Lindenmayer systems, fake news detection and the modeling of electrophysiology in cardiac cells.
WELP is an unparalleled opportunity to develop computational research skills, connect with peers and engage deeply with the subject matter, all while under the tutelage of an expert in the field.
Becoming a Teaching Assistant
Around half of WELP students are invited to join our Teaching Assistant Training Program. This training is made up of seminar classes focused on pedagogy, social activity design, strategies for academic support and professional development. Students design new social activities, practice the skills they’d need to help younger students with coding problems, develop a presentation about a previous project and work on professional skills like public speaking, time management and resume writing.
Students in this program have the opportunity to apply to be teaching assistants (TAs) at the Wolfram High School Summer Research Program, where they complete their capstone for the program—actually leading activities, helping younger students and delivering talks.
Being a TA at the Wolfram High School Summer Research Program is an unforgettable experience and many TAs come back for multiple summers.
Becoming a Mentor
Particularly successful students often return once they’re a little older to join the mentor team at the Wolfram High School Summer Research Program. They bring the experience of being part of the program at almost every level and they’re able to connect with students in meaningful ways beyond helping them with their projects. Being a mentor at the Wolfram High School Summer Research Program allows students to demonstrate their programming and research skills, further develop their leadership and project management skills and get a real step up in their future career pathways.
Other students are invited to join the mentor team at the Wolfram Middle School Summer Camp, where they teach Wolfram Language and computational thinking, working with young students to explore coding for the first time.
Stephen Wolfram’s Coding Adventures
Aside from WELP and returning in different capacities to the summer programs, some students are invited to Stephen Wolfram’s Coding Adventures group. This group gets to explore some of the cutting-edge research Wolfram gets done on his weekends, discuss fascinating issues and participate in the actual process of scientific exploration and entrepreneurship.
Wolfram Student Ambassador Initiative »
For students who are excited about writing more computational essays, mentoring at international hackathons and acting as ambassadors for Wolfram technologies at their schools, the Wolfram Student Ambassador Initiative is a great place to be! Students join a bustling community of global ambassadors, exchange ideas and challenges, publish their work and get primary access to internships at the company.
Wolfram Summer School »
Adults, whether they are high-school program alumni or not, may enjoy the Wolfram Summer School, a similar program to the Wolfram High School Summer Research Program designed for undergraduates and up. Students at the Wolfram Summer School address a research question in foundational science, science and technology, ruliology, education innovation or philosophy and strategy. Guided by an expert mentor, students create computational essays to publish on Wolfram Community.
Why Join the Wolfram Education Ecosystem?
Students can join us at any entry point to the Wolfram education ecosystem and we encourage anyone with a bright young person in their lives to recommend our programs. Students get access to outstanding educational opportunities, a large community of alumni and a structured environment in which to experiment, deepen their knowledge and understanding, find new topics to get excited about and make new friends just like them.
Applications for the Wolfram High School Summer Research Program, the Wolfram Middle School Summer Camp and the Wolfram Summer School open in November each year. The Wolfram Student Ambassador Initiative, Computational Adventures and Wolfram U are available year-round. We hope to see you, your children or your students there!
Find more information:
Wolfram High School Summer Research Program »
Wolfram Middle School Summer Camp »
Wolfram Emerging Leaders Programs »
Wolfram Summer School »
Wolfram U: Creative Computation »
Wolfram Student Ambassador Initiative »
Computational Adventures »
Watch the full “AMA: Wolfram’s Summer Education Programs” to hear Eryn and Rory answer questions about the Wolfram High School Summer Research Program and the Wolfram Middle School Summer Camp.
In the next few days, most people in the United States, Canada, Cuba, Haiti and some parts of Mexico will be transitioning from “standard” (or winter) time to “daylight” (or summer) time. This semiannual tradition has been the source of desynchronized alarm clocks, missed appointments and headaches for parents trying to get kids to bed at the right time since 1908, but why exactly do we fiddle with the clocks two times a year?
Why Do We Have Daylight Saving Time in the First Place?
The Sun has been humanity’s primary source for measuring the passage of time for almost all of human history, and, while it’s quite predictable for day-to-day uses, it has always had a few catches that have made timekeeping over longer durations or distances tricky. Unless you happen to live at or near the equator (in which case, you have a nearly constant 12-hour day/night cycle every day of the year), you’re no doubt aware that the length of the day changes throughout the year:
Compare this with the same period of time for a city located close to the equator:
This phenomenon gets more pronounced the farther away from the equator one moves:
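The dependence of day length on latitude can be approximated with the standard sunrise equation. The following is a minimal Python sketch rather than the Wolfram Language code used for this post. It assumes a simple cosine model for the solar declination and ignores atmospheric refraction and the size of the solar disk, and the function names are illustrative only.

```python
import math

def solar_declination_deg(day_of_year):
    """Approximate solar declination in degrees (simple cosine model, an assumption)."""
    return -23.44 * math.cos(2 * math.pi * (day_of_year + 10) / 365)

def day_length_hours(latitude_deg, day_of_year):
    """Approximate hours of daylight from cos(w) = -tan(latitude) * tan(declination)."""
    lat = math.radians(latitude_deg)
    decl = math.radians(solar_declination_deg(day_of_year))
    x = -math.tan(lat) * math.tan(decl)
    x = max(-1.0, min(1.0, x))  # clamp to handle polar day and polar night
    hour_angle_deg = math.degrees(math.acos(x))
    return 2 * hour_angle_deg / 15  # Earth rotates 15 degrees per hour

# Compare the equator with Glasgow (latitude of about 55.9 degrees north)
# near the June and December solstices.
for name, lat in [("equator", 0.0), ("Glasgow", 55.9)]:
    print(name, [round(day_length_hours(lat, d), 1) for d in (172, 355)])
```

In this model the equator stays at 12 hours all year, while the swing between summer and winter grows with latitude until the clamp produces 24-hour days inside the polar circles.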
Prior to the nineteenth century, most communities used local time determined by the Sun overhead. This variation throughout the year had little impact because time synchronization wasn’t necessary across long distances. You can see what your local solar time would read on a sundial with the SolarTime function:
This is pretty close to my current wall clock time (prior to the daylight saving time shift):
However, with the progression of industrialization and urbanization favoring the use of mechanical clocks (and in particular the advent of long-distance rail travel and telecommunication), standardized time quickly became a necessity. In 1847, Greenwich Mean Time (GMT) became the British standard, placing noon at the time when the mean Sun reached its zenith in Greenwich. You can see this in the time zone offset information for London, which prior to that time used a local solar time that was a fraction of an hour off from GMT:
This was, however, not without its problems. By midsummer, some parts of the UK were seeing sunrise near 3am, while sunset was happening at 9pm:
And because people didn’t simply adjust their daily routines to match the sunlight (they were now typically working to standardized mechanical clocks), this resulted in “wasted” sunlight in the morning while people were sleeping and excess use of energy on artificial lighting in the evening.
People were quick to suggest resetting the standard time throughout the year to more closely align with daylight, but the idea didn’t really catch on until World War I, when it was motivated largely by fuel conservation and spread quickly across Europe. During World War II, the UK actually instituted British Double Summer Time, in which the clock was moved forward two hours during the summer to maximize the use of natural light.
Compare the fraction of summer days on which Glasgow would have a pre-6am sunrise if it stayed on GMT with the fraction under a two-hour shift later:
Make the same comparison for a pre-9pm sunset over the same period:
Keeping Up with Daylight Saving Time
There have been many revisions to the schedule for daylight saving time within countries that observe it. There is even an entire database dedicated just to tracking these changing schedules across the globe, which is updated multiple times per year. This includes differing start/stop schedules, changes in which regions observe which shifts, and which countries have opted to stop observing daylight saving time shifts altogether (typically choosing to stay on “summer time” when doing so).
Most of Mexico opted to stop observing daylight saving time at the end of 2022, for example:
To give this a try with another time zone, LocalTimeZone will identify the name of a time zone based on a location. You can also use TimeZoneConvert to identify the current time in a set location:
Changes in daylight saving time schedules, as well as the different dates on which offset changes take place, can lead to scheduling headaches for things like teleconferencing. Take, for example, the difference in time between offices in Chicago, Glasgow and Sydney throughout a period of just six weeks:
Because of the different onset and end dates for daylight saving time, nearly every week between the beginning of March and the first week of April ends up with some new difference in time zone offset between the three offices. The same thing happens again in the second half of the year when the first two cities transition off daylight saving time and the Australian office transitions back to it:
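The staggered transitions can be illustrated with a small Python sketch. It hard-codes the transition rules in force at the time of writing as an assumption (second Sunday of March to first Sunday of November for Chicago, last Sunday of March to last Sunday of October for Glasgow, first Sunday of October to first Sunday of April for Sydney) and works at whole-day resolution, ignoring the exact hour of each switch. The helper names are illustrative only, and a real application should rely on the time zone database instead.

```python
from datetime import date, timedelta

def nth_sunday(year, month, n):
    """Date of the n-th Sunday (n >= 1) of the given month."""
    first = date(year, month, 1)
    days_to_sunday = (6 - first.weekday()) % 7  # Monday is 0, Sunday is 6
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))

def last_sunday(year, month):
    """Date of the last Sunday of the given month."""
    next_month = date(year + month // 12, month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)

def utc_offsets(day):
    """Whole-day UTC offsets in hours under the hard-coded rules (an assumption)."""
    y = day.year
    chicago_dst = nth_sunday(y, 3, 2) <= day < nth_sunday(y, 11, 1)
    glasgow_dst = last_sunday(y, 3) <= day < last_sunday(y, 10)
    # Southern Hemisphere: daylight time runs from October to early April.
    sydney_dst = day < nth_sunday(y, 4, 1) or day >= nth_sunday(y, 10, 1)
    return {
        "Chicago": -6 + chicago_dst,
        "Glasgow": 0 + glasgow_dst,
        "Sydney": 10 + sydney_dst,
    }

# Weekly snapshots across six weeks in March and April 2025.
day = date(2025, 3, 1)
while day <= date(2025, 4, 12):
    offsets = utc_offsets(day)
    print(day.isoformat(), offsets,
          "Sydney minus Chicago:", offsets["Sydney"] - offsets["Chicago"])
    day += timedelta(days=7)
```

Printing the offsets week by week shows the three transitions landing on different Sundays, which is why the gaps between the offices keep changing over this period.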
The US has also proposed (but not yet codified) a transition off of daylight saving time. Only time will tell if this semiannual tradition will continue, but in the meantime, Wolfram Language provides many tools for measuring and managing these time shifts. For more analysis on daylight times across the world, be sure to check out these posts from Wolfram Community:
Visualizing hours of daylight on the summer solstice
Circular sunset/sunrise calendar
Do you want to make optimal decisions against competition? Do you want to analyze competitive contexts and predict outcomes of competitive events? Do you need to elaborate strategies and plans against adversity and test the effectiveness of those strategies? Or are you simply an undergraduate student struggling to cope with a required course on game theory at your college?
Wolfram’s new suite of game theory functions will enable you to generate, play with, test, solve and visualize any event using game theory.
History of Game Theory
Originally, game theory was limited to simple games of chance. These have a few common characteristics: only two players are involved at a time and either one player wins and the other loses or both players have a zero payoff. These games are known today as two-player, zero-sum games.
If we define game theory based on the elaboration of optimal strategies, game theory may be as old as games of chance. We owe this early analysis (and probability theory!) to the famous polymath and gambler, Girolamo Cardano. Alternatively, if we define it by the analysis of games based on the possible actions of all players, then we may attribute its origins to James Waldegrave, for the analysis of the Le Her game in 1713, where minimax strategy solutions were given.
Of course, games can be more than just entertaining. After all, few games can claim to have a player base as big as the game of economics. Antoine-Augustin Cournot, in his 1838 research of the mathematical principles in the theory of wealth (Recherches sur les principes mathématiques de la théorie des richesses), discovered solutions to the price competition that would later be called Nash equilibria.
However, most of these discoveries are somewhat isolated and are usually considered as mere precursors to the modern subject. Game theory officially starts with John von Neumann and Oskar Morgenstern’s 1944 book Theory of Games and Economic Behavior, where the term “game theory” was coined and the axioms of game theory were determined, thus establishing its own field. This overview of game theory history would be incomplete without mention of John Nash, whose existence theorem for Nash equilibria transformed game theory only a few years later in 1950.
As you may imagine, this field of study has grown tremendously since then. Modern game theory is best summarized as the mathematics of decision making. At its heart, it studies the behavior of human, animal and artificial players in all forms of competition. Herbert Gintis said it best:
“Game theory is about how people cooperate as much as how they compete. … Game theory is about the emergence, transformation, diffusion and stabilization of forms of behavior.”
Matrix Games: Cat and Mouse
Matrix games are also known as simultaneous games. Indeed, these games are characterized by the simultaneity of the actions of all players. As the name implies, matrix games are based on matrices. To be precise, any matrix game of n players may be expressed by an (n + 1)-dimensional array, where the last dimension is a vector of payoffs for all players.
Matrix games can be generated using the new MatrixGame function in Version 14.2. For example, here is a two-player game in which each player has a choice of two actions:
This game is a zero-sum game, as can be seen from the Dataset representing the payoffs for each player:
We use a payoff of 1 to represent winning, and a payoff of –1 to represent losing. As such, in this game, the first player wins when the actions of both players are matching, and the second player wins when the actions of both players aren’t.
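To make the array representation concrete, here is a minimal Python sketch of the same two-player game as a three-dimensional nested list. This is an illustration of the data layout only and is not the MatrixGame API. The helper names are hypothetical.

```python
# payoffs[a1][a2] is the payoff vector [player 1, player 2] obtained when
# player 1 picks action a1 and player 2 picks action a2.
payoffs = [
    [[1, -1], [-1, 1]],
    [[-1, 1], [1, -1]],
]

def array_depth(a):
    """Nesting depth of a regular nested list."""
    return 1 + array_depth(a[0]) if isinstance(a, list) else 0

def is_zero_sum(game):
    """True if the payoffs of every outcome sum to zero."""
    return all(sum(outcome) == 0 for row in game for outcome in row)

print("array depth:", array_depth(payoffs))  # n + 1 with n = 2 players
print("zero-sum:", is_zero_sum(payoffs))
```

With two players the array has depth three: one axis per player plus the final payoff vector.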
While simultaneous games are usually expressed in terms of matrices, these rapidly become difficult to read as the number of players and actions increases. Hence, our team developed MatrixGamePlot for visualizing this class of games:
MatrixGamePlot and other functions are designed to work for games with any number of players. You may find that "SplitSquare" is a more intuitive layout for games with two players, while games with more than two players are better visualized using the default "BarChart":
As good as these visualizations may be, it is difficult to infer the story behind this game just from general visualizations. For ease of interpretation, consider two players: a cat and a mouse. The cat can either search the house or the yard, and the mouse can hide in the house or the yard. Of course, the cat wins if it is in the same place as the mouse, and the mouse wins if they are not in the same place.
The dataset and plot for the cat-and-mouse version of the game may be more easily read by specifying GamePlayerLabels and GameActionLabels in this game:
Of course, cats are lazy. It seems most likely that our cat is too sedentary to choose to go out of the house for a mere mouse. Knowing this, an opportunistic mouse should choose to always go to the yard, as this should lead to a higher payoff. To verify this, the mouse should use MatrixGamePayoff on this game and strategy, which allows it to calculate the expected payoffs of a game based on a given strategy:
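The expected payoff calculation itself is a probability-weighted sum over all outcomes. The following Python sketch shows the arithmetic for the cat-and-mouse game. It is not the MatrixGamePayoff implementation, and the action ordering (index 0 for the house, index 1 for the yard) is an assumption made for the example.

```python
# game[cat_action][mouse_action] = [cat payoff, mouse payoff]
# Action index 0 is "house" and index 1 is "yard" (an assumed ordering).
game = [
    [[1, -1], [-1, 1]],
    [[-1, 1], [1, -1]],
]

def expected_payoffs(game, strategy_1, strategy_2):
    """Expected payoff of each player when both play the given mixed strategies."""
    totals = [0.0, 0.0]
    for i, p in enumerate(strategy_1):
        for j, q in enumerate(strategy_2):
            for k in range(2):
                totals[k] += p * q * game[i][j][k]
    return totals

lazy_cat = [1, 0]    # always searches the house
yard_mouse = [0, 1]  # always hides in the yard
print(expected_payoffs(game, lazy_cat, yard_mouse))
```

Against a cat that never leaves the house, the mouse that always picks the yard collects the winning payoff every time.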
But what if the cat isn’t lazy? Some cats have those mystical eyes, unpredictable, a perfect poker face. To our mouse, it seems this cat will do whatever is necessary to win. The mouse must revisit its strategy, making it stable and strong, making sure it is at least as likely to win as the cat. The mouse uses VerifyMatrixGameStrategy to test all strategies where each animal chooses either the house or the yard, and to verify if a particular strategy is a Nash equilibrium. Unfortunately, it seems that all cases are unstable and the cat may be at an advantage:
Our mouse has one last ace up its sleeve: FindMatrixGameStrategies! This powerful function is our game solver, computing Nash equilibria, that is, strategies that neither player has any interest in changing. Using this tool, the mouse realizes its salvation lies in randomness:
This strategy is mixed, meaning that some actions are played with a probability that is neither 0 nor 1. In this case, the mouse will have to do a coin flip: if heads, stay in the house; if tails, go to the yard. Note that the cat cannot take advantage of this strategy, so its best strategy is also a 50/50.
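Both observations can be checked by hand. The Python sketch below tests every pure strategy pair for the Nash property and then solves the two indifference conditions of a 2×2 game. It is a textbook calculation under the assumption that a fully mixed equilibrium exists, not the algorithm used by VerifyMatrixGameStrategy or FindMatrixGameStrategies, and the function names are hypothetical.

```python
from itertools import product

# game[cat_action][mouse_action] = [cat payoff, mouse payoff]
game = [
    [[1, -1], [-1, 1]],
    [[-1, 1], [1, -1]],
]

def is_pure_nash(game, a1, a2):
    """True if neither player gains by unilaterally deviating from (a1, a2)."""
    best_1 = all(game[a1][a2][0] >= game[b][a2][0] for b in range(len(game)))
    best_2 = all(game[a1][a2][1] >= game[a1][b][1] for b in range(len(game[0])))
    return best_1 and best_2

def pure_nash_equilibria(game):
    """All pure strategy pairs that are Nash equilibria."""
    pairs = product(range(len(game)), range(len(game[0])))
    return [(a1, a2) for a1, a2 in pairs if is_pure_nash(game, a1, a2)]

def mixed_2x2(game):
    """Fully mixed equilibrium of a 2x2 game from the indifference conditions.

    Returns (p, q): the probabilities that player 1 and player 2 play action 0.
    Assumes such an equilibrium exists, so the denominators are nonzero.
    """
    a = [[game[i][j][0] for j in range(2)] for i in range(2)]
    b = [[game[i][j][1] for j in range(2)] for i in range(2)]
    # p makes player 2 indifferent between its actions; q does the same for player 1.
    p = (b[1][1] - b[1][0]) / (b[0][0] - b[0][1] - b[1][0] + b[1][1])
    q = (a[1][1] - a[0][1]) / (a[0][0] - a[0][1] - a[1][0] + a[1][1])
    return p, q

print("pure equilibria:", pure_nash_equilibria(game))
print("mixed equilibrium (p, q):", mixed_2x2(game))
```

For this game no pure pair survives the deviation test, and the indifference conditions give an even coin flip for both animals.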
Granted, the game so far is a bit unrealistic. The mouse likely has many more places to hide, and the cat has many more places to search. This would imply that each player has more than two actions and indeed there are other games for which this happens. For example, in the Morra game, each player has 50 possible actions. All previously seen features are generalizable to this or any number of actions:
Of course, it’s hardly a party when only two players play at one time. You can also specify matrix games such as El Farol Bar with any number of players, although I wouldn’t wish that many mice on even my worst enemy:
Here, we use GameTheoryData, a helpful tool for classical games that will be explained later.
Tree Games: A Game-Changing Revolution
Tree games are also known as sequential games. In these games, there are multiple actions, each taken by a single player in a given order. Like a decision tree, each decision reduces the number of possible actions until an end node is reached, where all players have a given payoff. As the name implies, tree games are based on Tree data structures. As such, they can be generated using nested lists following the structure of trees. Tree games are a big revolution in game theory, as, instead of a single event, they allow the analysis of a group of asynchronous interconnected decisions. This extends the applicability of game theory to many complex phenomena, otherwise out of reach using matrix games. Chess, for example, is a tree game, although an extremely large one, making its direct analysis as a tree game impractical.
Calling something a game usually implies that it is done for entertainment and not to be taken too seriously. That’s not the case in game theory, where a game refers to an event consisting of one or multiple decisions. Indeed, by that definition, almost everything that ever happened is a game, from throwing a rock into a lake to overthrowing a monarchy. Consider the latter as a “game,” where a colony has the choice to either rebel or concede, and in return, if there is rebellion, the country may grant independence or suppress the rebellion, and if the colony concedes, the country may tax or not. This situation can be represented using TreeGame:
Classically, tree games are represented as trees. As such, the use of the "Tree" property may be sufficient for basic tree games. However, to have more control over the plotting and to represent accurately more advanced tree games, TreeGamePlot is likely a better visualization:
If the colony rebels, the country has the choice to either grant independence or suppress rebellion. Since losing a colony is costly, the country always has greater interest in suppressing rebellion, as shown by comparing the second payoff of each outcome:
Even in terms of game theory alone, it is evident that the country is advantaged in this game. Often in tree games, it is the player that plays last that is advantaged, as the payoff is chosen directly. In this case, whatever strategy the colony employs, the country can ensure a positive payoff by choosing to suppress or tax, depending on the action of the colony:
It turns out that being taxed is better than being suppressed for this colony. Thus, even though this may not satisfy both players, the subgame perfect equilibrium is found when the country taxes the colony. This can be shown by solving the tree game using FindTreeGameStrategies:
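The reasoning above is backward induction: solve the last decisions first, then let earlier players choose knowing what will follow. Here is a minimal Python sketch of that procedure. The payoff numbers are hypothetical, chosen only to be consistent with the description in the text (they are not the values from the original TreeGame), and the tree encoding and the solve function are illustrative rather than the FindTreeGameStrategies implementation.

```python
# A node is either a leaf (colony payoff, country payoff) or a pair
# (player index, {action: subtree}). Player 0 is the colony, player 1 the country.
# All payoff values below are hypothetical.
game = (0, {
    "rebel": (1, {"grant independence": (2, -2), "suppress": (-2, 1)}),
    "concede": (1, {"tax": (-1, 2), "do not tax": (1, 0)}),
})

def solve(node):
    """Backward induction: return (payoffs, list of actions) under optimal play."""
    if not isinstance(node[1], dict):
        return node, []  # leaf: nothing left to decide
    player, branches = node
    best = None
    for action, subtree in branches.items():
        payoffs, path = solve(subtree)
        # Keep the branch that maximizes the payoff of the player moving here.
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + path)
    return best

print(solve(game))
```

With these assumed payoffs the country suppresses a rebellion and taxes a conceding colony, so the colony concedes and is taxed, which matches the subgame perfect equilibrium described above.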
Of course, tree games can have more than just two consecutive actions. For example, consider this game called Centipede, for which the name choice is still a mystery to me:
Tree games aren’t limited to two players either. For example, consider an inheritance game where we track the inherited golden cactus belonging to grandfather Zubair through his entire family:
Game Data: A Numbers Game
Game theory is plagued with a seemingly limitless number of named games. Without playing the blame game, let’s just say it makes the subject unreasonably daunting to beginners. Truth be told, with a decent understanding of matrix and tree games, you’ll generally be able to understand 99% of all named games. Whether you need a game quick and easy or you’d like to analyze a game you’ve never heard of before, GameTheoryData can help you. GameTheoryData currently has nearly 50 curated games:
It turns out it is quite difficult to infer the meaning and utility of a game just from a list of payoffs. Since each game has its own story, origin, quirks and properties, we wanted to allow users to explore the richness of game theory with well-curated games right within the language. Thus, each game has handy features enabling anyone to understand, research and characterize it. You can find a textual description of the game:
Learn about its origins in the literature:
Find the game classes it belongs to:
And, of course, play with it:
Ahead of the Game
We know that game theory can be a bit opaque to the uninitiated. In this respect, our first goal with these game theory features is to ease the process of constructing and playing with games, keeping it accurate and useful. Using extensive and approachable documentation, all these features have been kept in line with the fun and interesting character of games:
In fact, we’ve poured a lot of effort into the documentation of these functions. The number of examples, explanations and extensive consideration of even minute features goes far beyond the typical documentation for new features at launch. This level of documentation should enable you to pick up game theory in a matter of minutes:
Links to all these resources are readily available in one place to enable users to have a complete overview of these functionalities:
Game Over
Our current game theory features cater to learners of game theory more than other clientele, but don’t worry. We won’t declare “game over” early. We have many ideas for the future of Wolfram’s game theory suite. We know Wolfram features aren’t complete without vast generalizability. It may take some time, but we hope some generalizations of current functions may lead to great applicability to research and professional contexts. Most importantly, we’d love to get some feedback about what our users would like to see in the next iteration of these features.
Don’t be late to the game and try out these new features for yourself!
When I read a recent New York Times article on AI, I didn’t think I would be following in the footsteps of a Nobel laureate, but I soon discovered that I could do just that with Wolfram Language.
The Nobel Prize in Chemistry for 2024 was awarded for computational protein design and protein structure prediction, which have been active areas of research for several decades. The early work was built upon a foundation of physics and chemistry, attempting to model the folding of the chain of amino acid residues comprising a protein into a three-dimensional structure using conformational analysis and energetics. More recently, AI methods have been brought to bear on the problem, making use of deep neural networks (DNNs) and large language models (LLMs) such as trRosetta, AlphaFold and ESMFold. The work of David Baker, one of the laureates, was recently showcased in a New York Times article.
In their 2021 paper, Baker’s group described computational experiments that optimized a random sequence of amino acids into a realistic protein sequence and folded it into a three-dimensional structure. This process was repeated 2,000 times giving a “wide range of sequences and predicted structures.” The really exciting part came next: they made 129 synthetic genes in the lab based on the sequences, inserted them into the genome of E. coli bacteria, isolated and purified the new proteins and obtained their structures by x-ray crystallography and NMR spectroscopy, which closely matched the predicted structures.
We set out to explore the “computational X” part of their experiment in Wolfram Language. Some of the new features of the just-released Version 14.2 made this task surprisingly simple.
Folding an Amino Acid Sequence
Let’s begin with a protein of known structure. The N-terminal domain of the amyloid precursor protein (APP), which is implicated in Alzheimer’s disease, is a good example. The Protein Data Bank (PDB) entry ID is 1MWP. We can retrieve the citation title for the published data with this new-in-14.2 service connection request:
And we can retrieve the structure as a BioMolecule with this request:
We can also get the same result with much less typing with:
The full crystallographic structure is comprised of a single protein chain and many water molecules that co-crystallized with the protein molecule. The water molecules are collected into their own chain for convenience, so the BioMolecule has two chains:
The protein structure is comprised of a single chain that possesses two α-helices and several β-strands that can be seen visually here:
and tabulated here as residue ranges:
The "ESMAtlas" service is also new in 14.2 and allows one to fold a sequence using the model from Meta AI. This is the service request to fold the amino acid sequence:
The folded structure is also composed of an α-helix and several β-strands:
However, we can see here that the longer helix in the AI-folded structure is two residues shorter than in the crystal structure, and residues 73, 74 and 75 do not form a helix:
Two β-strands have been lost in the AI-folded structure:
So, how good is the fold quantitatively? The ESMAtlas service computes an atom-wise confidence which is stored in the "BFactors" property of the BioMolecule. The individual values range from 0 to 1, with higher values indicating greater confidence of the predicted three-dimensional position. Here are the atomic confidences for the atoms of the first five residues:
We can use these values to compute an overall confidence of the folded sequence, specifically, the root mean square of the atomic values:
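As a minimal sketch of that calculation, assume the atomic confidences have already been extracted into a plain list of numbers. The helper name confidence appears later in this post, but the body shown here is our assumption, and the sample values are made up for illustration:

```wolfram
(* Overall fold confidence as the root mean square of atom-wise confidences. *)
(* Assumption: atomConfidences is a flat list of numbers between 0 and 1. *)
confidence[atomConfidences_List] := RootMeanSquare[atomConfidences]

(* Illustrative values only, not real ESMAtlas output. *)
confidence[{0.6, 0.8, 1.0}]  (* Sqrt[2/3], about 0.816 *)
```

The root mean square weights high-confidence atoms slightly more than a plain mean would, so it is a somewhat optimistic summary of the fold.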
That’s pretty high, so it should be “close” to the experimental structure, and we can get an exact numerical comparison with the function BioMoleculeAlign, which is a prototype based on MoleculeAlign:
An RMS difference (RMSD) of the backbone atoms of 1.38 Å is pretty good, and visually we can see that the folded structure matches the experimental structure fairly closely. As expected, the deviation is largest at the N- and C-terminal portions of the protein:
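For readers who want the RMSD spelled out, here is a small sketch that assumes the two structures have already been superposed and reduced to equal-length lists of {x, y, z} coordinates. The function name rmsd and the coordinates are ours and purely illustrative; the alignment step itself is not reproduced:

```wolfram
(* RMSD between two already-superposed coordinate sets of the same size. *)
(* Each argument is an n x 3 matrix of atom positions (same atom order). *)
rmsd[a_?MatrixQ, b_?MatrixQ] /; Dimensions[a] == Dimensions[b] :=
  Sqrt[Mean[Total[(a - b)^2, {2}]]]

(* Two atoms, each displaced by exactly 1 unit along z. *)
rmsd[{{0, 0, 0}, {1, 0, 0}}, {{0, 0, 1}, {1, 0, 1}}]  (* 1 *)
```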
To get a broader overview of the folding accuracy, we did a search on the PDB website for monomeric, single-chain proteins with 95–105 residues, with the structure determined by x-ray diffraction, and a final resolution of 2.0 Å) at the source, but there are several other potential issues that need to be dealt with.
First, databases are not perfect. Even though the search specified “protein entities,” some proteins conjugated with oligosaccharides were included in the search results. They have the chain type "Branched":
Here is what the first one looks like. The sugar moieties are rendered at atomic-level details, as in MoleculePlot3D:
So, let’s remove those two hits:
Second, Meta AI’s protein folding model only accepts sequences composed of a very limited number of the more than 500 known naturally occurring amino acids, many of which are found in proteins in the PDB. There are 21 proteinogenic amino acids that are coded for by DNA, and ESMFold uses only 20 of them (selenocysteine is the maverick amino acid).
Amino acids are often represented with their three-letter abbreviations, Ala for alanine, Trp for tryptophan, etc. For even more brevity, biologists also use one-letter codes (only the proteinogenic amino acids have one), as shown in this table:
We can use the one-letter codes to construct a filter:
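One way such a filter could look is sketched below. The name acceptableSequenceQ is mentioned in the summary of this post, but the body here is our assumption, built only on core string functions and the 20 standard one-letter codes:

```wolfram
(* One-letter codes of the 20 amino acids assumed to be accepted by the folding model. *)
standardCodes = Characters["ACDEFGHIKLMNPQRSTVWY"];

(* True only if every character of the sequence is one of the 20 codes. *)
acceptableSequenceQ[seq_String] :=
  StringMatchQ[seq, (Alternatives @@ standardCodes) ..]

acceptableSequenceQ["MKTAYIAKQR"]  (* True *)
acceptableSequenceQ["MKTXUIAKQR"]  (* False: X and U are not in the list *)
```

Working on plain one-letter strings keeps the test independent of how the sequence was obtained, so it can be applied before any call to the folding service.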
Let’s test it with APP that we retrieved from the PDB above:
So far, so good. The synthetic peptide 5V63 was made to study the oligomerization of the β-amyloid protein and contains ornithine, sarcosine and iodophenylalanine. It should fail the test:
Great! Now to filter the hits:
Third, x-ray crystallography is not perfect. Many crystals are not ideal and contain defects. One common defect in crystals of proteins is disorder, in which a portion of the protein does not crystallize the same way in every unit cell. This phenomenon effectively blurs the electron density (it’s the electrons that scatter the x-rays), and the atoms cannot be located. The disorder often happens at the ends of the protein chain, but it can also happen where there are loops between α-helices and β-strands.
To make the comparison of the protein folding results most informative, we should remove those hits that have fewer observed residues than the full sequence. The first hit, 1A68, has unobserved residues, as indicated by the smaller modeled monomer count:
Processing the whole list and selecting those entries with equal counts gives us the final list of hits:
And, finally, we can do the analysis, that is, folding the sequence and comparing it to the experimental geometry:
Most of the folded structures agree with the experimental structure quite well with an RMSD of 4 Å or less:
Overall, the results look quite good. A large RMSD is to be expected when the fold confidence is less than 0.75, so the unexpected outliers have confidence greater than 0.75 and an RMSD greater than about 5 Å. What are they?
The structure 4J4C is the G51P mutation (proline replacing glycine at residue position 51) of 3EZM. Both are head-to-tail dimers that are intertwined. We can use the “assembly” information of the biomolecules to view the dimers (one half of each dimer is shown in blue and the other half in yellow):
The ESMFold model assumes the input sequence is for a monomeric structure, so it’s not surprising that it fails for these intertwined dimers.
Optimizing a Random Sequence
Baker’s group carried out the de novo design by first constructing a random sequence and then iteratively mutating one residue at a time. The position of the mutation was randomly selected from a uniform distribution, as was the new amino acid. The sequence was folded at each iteration, and the change was accepted if the fitness of the predicted structure of the mutated sequence, $F_i$, increased. For a decrease of the fitness, the change was accepted based on the Metropolis criterion, which in its standard form for a fitness that is being maximized gives the acceptance probability $P = \exp\left[(F_i - F_{i-1})/t\right]$ ($F_{i-1}$ is the fitness before the mutation),
where t is the temperature, which was decreased in steps over the course of the iteration, effectively giving a simulated annealing algorithm. Strictly speaking, simulated annealing uses energy instead of fitness, and the temperature then has a physical meaning. They used the contrast between the inter-residue distance distributions predicted by the trRosetta network and background distributions averaged over all proteins as the fitness and an initial temperature scaled appropriately. They used initial sequences of 100 residues and an arbitrarily large number, 40,000, of iterations. We’ll follow this basic outline and adapt as needed for Wolfram Language.
Initial Random Sequence
We’ve already talked a little bit about amino acids and protein sequences. The BioSequence returned by the "BioSequences" property of a BioMolecule can return either the three-letter code or the one-letter code sequences as a string. For the amyloid precursor protein, we have:
Using the one-letter codes will be convenient for constructing random sequences and manipulating them:
Here is a random sequence of 100 residues:
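A minimal sketch of such a generator follows. The name randomSequence is listed in the summary of this post; the body, the uniform sampling over the 20 standard codes and the seed value are our assumptions:

```wolfram
(* The 20 standard one-letter amino acid codes. *)
aminoAcidCodes = Characters["ACDEFGHIKLMNPQRSTVWY"];

(* Draw n codes uniformly at random and join them into one string. *)
randomSequence[n_Integer?Positive] :=
  StringJoin[RandomChoice[aminoAcidCodes, n]]

SeedRandom[1234];  (* arbitrary seed, only to make the run repeatable *)
seq0 = randomSequence[100];
StringLength[seq0]  (* 100 *)
```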
Folding the sequence gives a BioMolecule, as we have seen before:
We can see that it doesn’t have any secondary structure elements and doesn’t look very much like a naturally occurring protein:
The residues have been colored starting at the N-terminal end blue through green, to yellow, to orange and finally to red at the C-terminal end.
Fitness
While we can compute the inter-residue distance distributions for the predicted fold, we don’t have the background distributions averaged over all proteins (e.g. from the PDB) used by Baker’s team, and therefore we cannot compute the divergence to use as the fitness.
However, all is not lost because as we have seen above, we can compute an overall confidence of a fold, which should be suitable as fitness. Not surprisingly, the fitness of the predicted fold for this random sequence is not very high:
Residue Mutation
The next thing we need to be able to do is mutate the sequence. First, a position in the sequence is randomly chosen, and then the amino acid at that position is replaced by a different amino acid:
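The two steps just described can be sketched with core string functions. The function name mutate is taken from the summary of this post, but this particular body is an assumption rather than the code that was actually run:

```wolfram
aminoAcidCodes = Characters["ACDEFGHIKLMNPQRSTVWY"];

(* Replace one uniformly chosen residue with a different amino acid. *)
mutate[seq_String] := Module[{pos, old, new},
  pos = RandomInteger[{1, StringLength[seq]}];           (* position to mutate *)
  old = StringTake[seq, {pos}];                          (* current residue *)
  new = RandomChoice[DeleteCases[aminoAcidCodes, old]];  (* guaranteed different *)
  StringReplacePart[seq, new, {pos, pos}]
]

mutate["ACDEFGHIKL"]
```

Removing the current residue from the candidate list guarantees that every call produces a sequence that differs from the input at exactly one position.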
We can use Diff to see where the mutation took place:
The valine at position 96 was replaced by alanine. That is, we have made the V96A mutant. What is the effect on the structure?
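The conventional mutant label can also be derived without Diff by comparing the two sequences character by character. This is a hypothetical helper (mutationLabel is our name), shown on a short made-up pair of sequences:

```wolfram
(* Label each substitution as <old residue><position><new residue>, e.g. "V96A". *)
mutationLabel[old_String, new_String] /; StringLength[old] == StringLength[new] :=
 Module[{pos},
  (* positions where the two sequences differ *)
  pos = Flatten[Position[
     MapThread[UnsameQ, {Characters[old], Characters[new]}], True]];
  StringJoin[StringTake[old, {#}], ToString[#], StringTake[new, {#}]] & /@ pos
 ]

mutationLabel["ACDVF", "ACDAF"]  (* {"V4A"} *)
```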
Interestingly, the effect is not entirely local, and three α-helices have emerged well separated from the location of the mutation (the short red helix). Does that lead to an increase or decrease in the overall fitness?
Simulated Annealing Optimization
The fitness, i.e. confidence of the prediction, has decreased slightly. The Metropolis criterion for accepting the mutation is computed as:
The test for acceptance is:
So this change with its slight decrease in fitness would be accepted.
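The acceptance test can be sketched as follows, assuming the standard Metropolis form for a fitness that is being maximized. The function names and the numbers in the example are ours and not the values from the run above:

```wolfram
(* Probability of accepting a move from fitness fOld to fNew at temperature t. *)
(* Improvements are always accepted; decreases are accepted with Exp[delta/t]. *)
metropolisProbability[fOld_?NumericQ, fNew_?NumericQ, t_?Positive] :=
  Min[1, Exp[(fNew - fOld)/t]]

(* Accept if a uniform random number falls below that probability. *)
metropolisAcceptQ[fOld_, fNew_, t_] :=
  RandomReal[] < metropolisProbability[fOld, fNew, t]

(* Made-up numbers: a drop of 0.01 in fitness at temperature 0.1. *)
metropolisProbability[0.40, 0.39, 0.1]  (* Exp[-0.1], about 0.905 *)
```

A small decrease at a comparatively high temperature is therefore accepted most of the time, which is what lets the search escape local maxima early in the run.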
We can roll up the preceding code into a function for sequence optimization. The ESMAtlas service limits the number of API calls a user can make in a given period of time, but the details are not disclosed. A pausing mechanism has been built into the code to accommodate the throttle imposed by the service. We’ve also included a progress monitor because making API calls can be slow depending on how many other users are calling the service:
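The overall shape of such a function is sketched below under several stated assumptions. The fitness is passed in as an arbitrary function so that the sketch runs without any service call, the names annealSequence, mutateOnce and toyFitness are hypothetical, the starting temperature t0 is a free parameter and the temperature half-life is set to 1/8 of the number of steps. The pausing mechanism and the progress monitor are left out:

```wolfram
aminoAcidCodes = Characters["ACDEFGHIKLMNPQRSTVWY"];

(* Single-residue mutation at a uniformly chosen position. *)
mutateOnce[seq_String] := Module[{pos, old},
  pos = RandomInteger[{1, StringLength[seq]}];
  old = StringTake[seq, {pos}];
  StringReplacePart[seq,
   RandomChoice[DeleteCases[aminoAcidCodes, old]], {pos, pos}]
]

(* fitnessFunction: any function mapping a sequence string to a number.
   In the post this role is played by folding plus overall confidence. *)
annealSequence[seq0_String, fitnessFunction_, nSteps_Integer?Positive,
  t0_?Positive] :=
 Module[{seq = seq0, fit, trial, trialFit, t, halfLife = nSteps/8., history},
  fit = fitnessFunction[seq];
  history = {{seq, fit}};
  Do[
   t = t0*2^(-step/halfLife);        (* continuous exponential cooling *)
   trial = mutateOnce[seq];
   trialFit = fitnessFunction[trial];
   (* Metropolis test: keep improvements, sometimes keep decreases *)
   If[trialFit >= fit || RandomReal[] < Exp[(trialFit - fit)/t],
    seq = trial; fit = trialFit];
   AppendTo[history, {seq, fit}],
   {step, nSteps}];
  history
 ]

(* Toy stand-in fitness, NOT a folding confidence: fraction of alanine. *)
toyFitness[s_String] := N[StringCount[s, "A"]/StringLength[s]]

SeedRandom[1];
result = annealSequence["MKTVLGHWQE", toyFitness, 200, 0.05];
Last[result]
```

Recording the current pair at every step, whether or not the trial was accepted, makes it easy to see afterward how often the sequence stayed unchanged.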
This is an example of the progress monitor display:
To keep this proof-of-concept exercise tractable, let’s use only 1,000 iterations:
Optimization Results
The returned result is a list of pairs of the form {sequence_i, fitness_i}, where the fitness is the overall confidence of the fold. Let’s take a look at the final sequence to see how we did:
That’s really quite nice! Are there any structures in the PDB that have a similar sequence?
None. How about the UniProt database? This search was done manually, and it found one hit. Here is a snippet of the raw BLAST output:
Less than half of our sequence (residues 2–46) had some similarity to residues 242–286 of the 467-residue protein 3-isopropylmalate dehydratase large subunit. Statistically, the hit is not very good. The E-score is 9.7 (the “Expect” value in the output), and good homology matches have an E-score of 10⁻⁵ or less. It’s safe to say that our de novo–designed protein has not been seen before.
What else can we learn from the optimization? Here is how the fitness improved over the optimization:
A large fraction of the iterations did not change the sequence:
And here is how the change in fitness evolved (the zeros have been elided):
Here is where the mutations took place over the course of the optimization:
And here is the residue position frequency distribution:
Twelve residue positions (6, 8, 46, 53, 62, 64, 69, 71, 82, 83, 96, 98) were not modified. Either they were not selected or the changes happened to be deleterious and failed to pass the Metropolis criterion. The best remedy would be to use more iterations (Baker used 40,000).
How did the amino acid content evolve, and what is the final distribution?
Isoleucine (I), arginine (R) and threonine (T) are the most frequent amino acids in the last sequence, and methionine (M) was lost altogether.
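Tallies of this kind need only core functions. In this sketch the helper names compositionOf and missingResidues are ours, and the input strings are made up rather than taken from the optimization:

```wolfram
aminoAcidCodes = Characters["ACDEFGHIKLMNPQRSTVWY"];

(* Residue counts, most frequent first. *)
compositionOf[seq_String] := ReverseSort[Counts[Characters[seq]]]

(* Standard amino acids that do not occur in the sequence at all. *)
missingResidues[seq_String] := Complement[aminoAcidCodes, Characters[seq]]

compositionOf["IIRTTIRA"]
missingResidues["ACDEFGHIKLNPQRSTVWY"]  (* {"M"} *)
```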
How did the geometry of the fold evolve? Let’s take 10 examples from a geometric progression through the iterations and fold them:
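One simple way to pick step numbers along a geometric progression is sketched here. The helper name sampleSteps and the rounding to integer steps are our assumptions, and the exact steps used for the figures in this post may differ:

```wolfram
(* nSamples step numbers spaced geometrically between 1 and nSteps. *)
(* Union removes duplicates that rounding can create at the low end. *)
sampleSteps[nSteps_Integer?Positive, nSamples_Integer /; nSamples > 1] :=
  Union[Round[nSteps^(Range[0, nSamples - 1]/(nSamples - 1))]]

sampleSteps[1000, 10]
(* {1, 2, 5, 10, 22, 46, 100, 215, 464, 1000} *)
```

Geometric spacing concentrates the samples at the start of the run, where the fold changes most rapidly.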
Now, starting with the last biomolecule, align the preceding biomolecule to it and repeat the process sequentially back to the first biomolecule of the sample. Computationally we do this with the function Fold , consuming elements of the reversed list of structures and appending each aligned structure to the growing result list:
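The bookkeeping of that Fold can be shown independently of any real alignment. In this sketch alignChain is a hypothetical name, and align stands for any two-argument function that returns its first argument aligned to the second; we assume each structure is aligned to the previously aligned one. Leaving align symbolic makes the order of operations visible:

```wolfram
(* Start from the last structure and walk backward through the list,
   aligning each structure to the most recently aligned one. *)
alignChain[structures_List, align_] :=
  Fold[
    Function[{aligned, next}, Append[aligned, align[next, Last[aligned]]]],
    {Last[structures]},          (* the reference: the final structure *)
    Rest[Reverse[structures]]    (* the remaining structures, last to first *)
  ]

alignChain[{s1, s2, s3}, f]
(* {s3, f[s2, s3], f[s1, f[s2, s3]]} *)
```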
Plotting the alignment RMSD will give a rough idea of the pairwise structural similarity of the sample:
Now, let’s take a look at the structures. In the plots below, the residues have been colored by the confidence of the prediction. The first number in each panel is the overall confidence of the fold, and the second number is the step in the iteration:
By the time the overall confidence reaches about 0.7, the fold has settled down.
Another way to assess the evolution of the fold makes use of inter-residue contact maps. As observed by Baker’s team, the maps are initially diffuse and become sharper over the course of the optimization:
Out of curiosity, we manually submitted the optimized sequence to the trRosetta server. Here are the folds that were predicted with the use of templates, along with the overall confidence:
The trRosetta model gives an atomic position confidence on a scale of 0 to 100, and the overall confidence for each fold is not very high.
The folding report included these remarks:
The confidence of the model is very low. It was built based on de novo folding, guided by deep learning restraints.
This is a single-sequence folding with trRosettaX, as no significant sequence homologs was detected (in uniclust30_2018_08).
How do they compare to our optimized structure predicted by ESMAtlas? Not very closely based on the alignment RMSD:
How Much Is Enough?
So, how many iterations is enough to get useful results? Even with today’s fast personal computers and fast internet speeds, it can take hours to evolve a sequence using the API service. As we saw above, 1,000 iterations is not enough to sample all 100 residue positions even once, much less repeatedly, to find an optimal amino acid for each.
5,000 Iterations
This iteration is not merely an extension of the previous one, even though they both began with the same initial random sequence and the same random state. This is because the half-life of the temperature decay is longer, allowing this iteration to diverge as soon as the Metropolis criterion gives a different outcome. This fact becomes obvious once the fitness of the two iterations is compared:
The final optimized sequence bears no resemblance to the one for the shorter iteration:
The amino acid distribution is also quite different. Most notably, several amino acids are no longer present: phenylalanine (F), histidine (H), proline (P) and tryptophan (W):
So, what is the fold for this sequence?
It’s all one long α-helix! Considering that there are no proline residues, this result is not surprising. Proline contains a ring that restricts the rotation about the φ angle, usually resulting in a kink in the backbone at that position. So, no proline residues means no turns:
The plots below show that the general topology of the final fold is reached fairly early, at about 40% of the way through the iteration when the confidence is around 0.75, as we saw in the shorter iteration:
10,000 Iterations
The evolution of the fitness for all three iterations is shown below. The 5,000- and 10,000-step iterations have produced sequences with very high confidence, but they have by no means converged to a maximal confidence, as shown in the inset:
While the sampling of the residue positions is about twice as high as in the shorter 5,000-step iteration, it is more diffuse:
Moreover, only 15% of the residue positions experienced every amino acid at least once (all-green columns in the plot below), and none of the amino acids resided at least once at each of the residue positions (all-green rows, only alanine came close). So, going to 20,000 or even 40,000 steps (as did Baker) may be necessary for adequate annealing:
We again see the loss of several amino acids, and notably proline is among the absent:
The final sequence is:
Another all-α-helix fold, as foretold by the complete absence of proline. How similar are the two sequences?
There is some similarity between the two sequences, but it’s not very high. Can we discover why a proline-free sequence is the result of this optimization?
The Curious Case of Proline
One hypothesis might be that once proline is absent, it’s hard to restore it. The first proline-free sequence is encountered at step 2602:
What is the probability of a successful replacement by proline at each residue position? To compute those probabilities, we need the fitness (overall confidence of the fold) of the proline-free sequence and each of the proline-substituted sequences:
We also need the temperature at that step:
And finally the probabilities:
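The three ingredients can be sketched together as follows. The function names, the starting temperature of 0.05 and the toy fitness are assumptions for illustration; the half-life of 1,250 corresponds to 1/8 of a 10,000-step run, and in the real calculation the fitness of each variant comes from folding it:

```wolfram
(* All single-site proline substitutions of a sequence. *)
prolineVariants[seq_String] :=
  Table[StringReplacePart[seq, "P", {i, i}], {i, StringLength[seq]}]

(* Continuous exponential cooling with a given half-life (in steps). *)
temperature[step_, t0_, halfLife_] := t0*2^(-step/halfLife)

(* Metropolis acceptance probability for each proline substitution. *)
prolineAcceptance[seq_String, fitnessFunction_, t_?Positive] :=
 With[{f0 = fitnessFunction[seq]},
  Min[1, Exp[(fitnessFunction[#] - f0)/t]] & /@ prolineVariants[seq]
 ]

(* Toy stand-in fitness, NOT a folding confidence: fraction of alanine. *)
toyFitness[s_String] := N[StringCount[s, "A"]/StringLength[s]]

prolineAcceptance["AAGA", toyFitness, temperature[2602, 0.05, 1250]]
```

Because the temperature appears in the denominator of the exponent, the same drop in fitness becomes far less acceptable later in the run.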
We see that a large fraction of the residue positions have a very low probability, but there is still a sizable number with a probability of 1, and, in fact, proline does reappear at a later step only to be ultimately lost. The last step to lose a proline is 5885:
Computing the probabilities to restore a proline at this point in the iterations follows:
So it has become very much harder to restore proline to the sequence, and one ends up with a sequence that folds into an all-α-helical form.
A Weighty Improvement
Not all residue positions are created equal, and when it comes to structure prediction, some are better characterized than others. Since we’re doing design by optimization, could we improve the process by preferentially mutating the residue positions with lower prediction confidence? That is, give more attention to the less-well-defined regions.
We used residue-wise confidence to color code the folded sequences, and we can use the same residue-wise confidence to weight the residue selection in mutation.
Here is a refactored mutation function that can take an optional set of residue position weights:
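A sketch of how the optional weights might be handled is shown below. The name mutateWeighted, the Automatic default and the made-up confidences are our assumptions; the weighting by one minus the residue confidence follows the idea described in this section:

```wolfram
aminoAcidCodes = Characters["ACDEFGHIKLMNPQRSTVWY"];

(* Mutate one residue; positions are drawn with the given relative weights.
   With weights omitted, every position is equally likely. *)
mutateWeighted[seq_String, weights_: Automatic] :=
 Module[{n = StringLength[seq], w, pos, old},
  w = If[weights === Automatic, ConstantArray[1, n], weights];
  pos = RandomChoice[w -> Range[n]];     (* weighted choice of position *)
  old = StringTake[seq, {pos}];
  StringReplacePart[seq,
   RandomChoice[DeleteCases[aminoAcidCodes, old]], {pos, pos}]
 ]

(* Made-up residue confidences; low confidence means high mutation weight. *)
residueConf = {0.9, 0.9, 0.2, 0.9};
mutateWeighted["ACDE", 1 - residueConf]
```

With these numbers the third residue is eight times as likely to be selected as any one of the others.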
Taking the optimized sequence from the 1,000-iteration optimization as the input, here is a mutation without weights:
The confidence-based weights are calculated with:
Here is the mutation with those weights, starting, of course, from the same random state:
We can add an option to the sequenceSimulatedAnnealing function to use confidence-based weights (code changes highlighted in blue):
The time course of the fitness for the weighted optimization is roughly the same as for the unweighted. It’s not immediately obvious if the difference in fitness at the end is significant:
Surprisingly, substantially fewer mutations met the Metropolis criterion:
And that most likely explains the lower fitness at the end. There is a hint of more frequent mutation at the C-terminal end of the sequence (residue 100) in the plot on the right. This is due to the characteristically lower confidence in the residue geometries at the ends of the sequence. In a different 10,000-step optimization, it was much more obvious:
There is almost no sequence similarity between the two final sequences:
And, happily, the folded sequence has a different topology:
Again, the public protein databases are devoid of similar sequences:
Four hits were found in the UniProt database. The best of them, although not very good with an E-score of 3.4, is:
Summary
Well, what have we learned on our computational expedition? I think foremost is that surprisingly little coding was necessary. We created a few “one-liners” (confidence, acceptableSequenceQ, randomSequence, residueConfidence, mutate) and only one big function (sequenceSimulatedAnnealing). Everything else that we needed was built into Wolfram Language and just worked.
The ability to start with just a sequence of amino acid codes (1° structure) and in one step obtain a realistic three-dimensional protein structure (3° structure) is utterly amazing and deeply satisfying. The advent of LLMs is truly worthy of a Nobel Prize, and that we can easily climb onto the shoulders of those giants is breathtaking.
We also learned that experimental data can be challenging to use. Doing good science requires attention to detail and frequently asking why a particular result was obtained. As we went along, we postulated hypotheses and tested them.
We’ve only just scratched the surface of computational biology, and Wolfram Language will allow us to go much further.
Ideas for Further Exploration
One area for fruitful exploration is the correlation of amino acid properties with the optimized folds. Where are the polar and nonpolar residues located three-dimensionally? What about charged residues, such as arginine, histidine, aspartate and glutamate?
What is the effect of increasing or decreasing the rate of cooling? We set the temperature half-life to be 1/8 of the number of iterations, as was used by Baker’s group. However, we used a continuous protocol while they used a stepwise protocol.
Are there other optimization strategies that might be more efficient? We’ve already seen that weighting the residue position selection by 1 − residueConfidence increases the sampling of the less-well-defined regions of the chain. Is there a weighting by amino acid that could be exploited? What would be the effect of giving a preference to some amino acids over others? For example, there is a class of proteins known as glycine-rich proteins that contain more than 60% glycine residues and are found in tissues of many eukaryotic organisms.
Many proteins contain disulfide bridges between cysteine residues. How could this feature be incorporated into the random sequence generation and subsequent mutation? Can ESMAtlas fold sequences with this topological constraint?
What other optimization goals could one use? We optimized the confidence of the fold. How could you optimize for a particular shape or combination of helices and sheets? How could you optimize an enzyme active site or a receptor binding pocket?
Acknowledgments
Special thanks are due to Jason Biggs of the Wolfram Chemistry Team for useful discussions, quick bug fixes and solid code design and development for the new BioMolecule framework. Soutick Saha of the Chemistry Team has also been helpful guiding my sometimes wayward steps through the plethora of online protein and bioinformatics resources, and he made several suggestions to improve this post. Jon McLoone made some improvements to the MarginalPlot resource function that gave me better control over the histograms.