Wolfram Blog (Thu., Dec. 14)
News, views, and ideas from the front lines at Wolfram Research.

1. Creating Mathematical Gems in the Wolfram Language (Thu., Dec. 14)

The Wolfram Community group dedicated to visual arts abounds with technically and aesthetically stunning contributions. Many of these posts come from prolific contributor Clayton Shonkwiler, who has racked up over 75 “staff pick” accolades. Recently I got the chance to interview him and learn more about the role of the Wolfram Language in his art and creative process. But first, I asked Wolfram Community’s staff lead, Vitaliy Kaurov, what makes Shonkwiler a standout among mathematical artists.

“Stereo Vision” and “Rise Up”

“Clay, I think, pays special attention to expressing a math concept behind the art,” Kaurov says. “It is there, like a hidden gem, but a layman will not recognize it behind the beautiful visual. So Clay’s art is a thing within a thing, and there is more to it than meets the eye. That mystery is intriguing once you know it is there. But it’s not easy to express something as abstract and complex as math in something as compact and striking as a piece of art perceivable in a few moments. This gap is bridged with help from the Wolfram Language, because it’s a very expressive, versatile medium inspiring the creative process.”

Shonkwiler is a mathematics professor at Colorado State University and an avid visual artist, specializing in Wolfram Language–generated GIF animations and static images based on nontrivial math. “I am interested in geometric models of physical systems. Currently I’m mostly focused on geometric approaches to studying random walks with topological constraints, which are used to model polymers,” he says.

In describing how he generates ideas, he says, “There are some exceptions, but there are two main starting points. Often I get it into my head that I should be able to make an animation from some interesting piece of mathematics. For example, in recent months I’ve made animations related to the Hopf fibration.”

“Stay Upright”

"Stay Upright" code

DynamicModule[{n = 60, a = \[Pi]/4,
  viewpoint = {1, 1.5, 2.5}, \[Theta] = 1.19, r = 2.77, plane,
  cols = RGBColor /@ {"#f43530", "#e0e5da", "#00aabb", "#46454b"}},
 plane = NullSpace[{viewpoint}];
 (* Assumption: the Manipulate[Graphics[{ ... wrapper and the InfiniteLine
    head were lost in this copy; they are reconstructed here from the
    bracket structure and the surviving options *)
 Manipulate[
  Graphics[{
    Table[{Blend[cols[[;; -2]], r/\[Pi]],
       InfiniteLine[
        RotationMatrix[\[Theta]].plane.# & /@ {{Cot[r] Csc[a], 0,
           Cot[a]}, {0, Cot[r] Sec[a], -Tan[a]}}]}, {r, \[Pi]/(2 n) +
        s, \[Pi], 2 \[Pi]/n}]}, Background -> cols[[-1]],
   PlotRange -> r, ImageSize -> 540], {s, 0., 2 \[Pi]/n}]]

Like many artists, Shonkwiler draws inspiration from existing art and attempts to recreate it or improve upon it using his own process. He says, “Whether or not I actually succeed in reproducing a piece, I usually get enough of a feel for the concept to then go off in some new direction with it.”

As to the artists who inspire him, Shonkwiler says, “There’s an entire community of geometric GIF artists on social media that I find tremendously inspiring, including Charlie Deck, davidope, Saskia Freeke and especially Dave Whyte. I should also mention David Mrugala, Alberto Vacca Lepri, Justin Van Genderen and Pierre Voisin, who mostly work in still images rather than animations.” If you want to see other “math art” that has inspired Shonkwiler, check out Frank Farris, Kerry Mitchell, Henry Segerman, Craig Kaplan and Felicia Tabing.

Another artistic element in Shonkwiler’s pieces is found in the title he creates for each one. You’ll find clever descriptors, allusions to ancient literature and wordplay with mathematical concepts. He says he usually creates the title after the piece is completely done. “I post my GIFs in a bunch of places online, but Wolfram Community is usually first because I always include a description and the source code in those posts, and I like to be able to point to the source code when I post to other places. So what often happens is I’ll upload a GIF to Wolfram Community, then spend several minutes staring at the post preview, trying to come up with a title.” Although he takes title creation seriously, Shonkwiler says, “Coming up with titles is tremendously frustrating because I’m done with the piece and ready to post it and move on, but I need a title before I can do that.”


"Interlock" code

Stereo[{x1_, y1_, x2_, y2_}] := {x1/(1 - y2), y1/(1 - y2),
   x2/(1 - y2)};

With[{n = 30, m = 22, viewpoint = 5 {1, 0, 0},
  cols = RGBColor /@ {"#2292CA", "#EEEEEE", "#222831"}},
 (* Assumption: the Manipulate / Graphics3D / Table / Tube / Stereo
    wrapper was lost in this copy; it is reconstructed here from the
    bracket structure and the surviving options *)
 Manipulate[
  Graphics3D[{
    Table[
     Tube[
      Table[
       Stereo[
        RotationTransform[-s, {{1, 0, 0, 0}, {0, 0, 0, 1}}][
         1/Sqrt[2] {Cos[\[Theta]], Sin[\[Theta]], Cos[\[Theta] + t],
           Sin[\[Theta] + t]}]], {\[Theta], 0., 2 \[Pi],
        2 \[Pi]/n}]], {t, 0., 2 \[Pi], 2 \[Pi]/m}]},
   ViewPoint -> viewpoint, Boxed -> False, Background -> cols[[-1]],
   ImageSize -> 500, PlotRange -> 10, ViewAngle -> \[Pi]/50,
   Lighting -> {{"Point", cols[[1]], {0, -1, 0}}, {"Point",
      cols[[2]], {0, 1, 0}}, {"Ambient", RGBColor["#ff463e"],
      viewpoint}}], {s, 0, \[Pi]}]]

Other Wolfram Community members have complimented Shonkwiler on the layers of color he gives his geometric animations. Likewise, his use of shading often enhances the shapes within his art. But interestingly, his work usually begins monochromatically. “Usually I start in black and white when I’m working on the geometric form and trying to make the animation work properly. That stuff is usually pretty nailed down before I start thinking about colors. I’m terrible at looking at a bunch of color swatches and envisioning how they will look in an actual composition, so usually I have to try a lot of different color combinations before I find one I like.”

Shonkwiler says that the Wolfram Language makes testing out color schemes a quick process. “If you look at the code for most of my animations, you’ll find a variable called cols so that I can easily change colors just by changing that one variable.”

“Magic Carpet” and “Square Up”

I asked Shonkwiler if he conceives the visual outcome before he starts his work, or if he plays with the math and code until he finds something he decides to keep. He said it could go either way, or it might be a combination. “‘Magic Carpet’ started as a modification of ‘Square Up,’ which was colored according to the z coordinate from the very earliest versions, so that’s definitely a case where I had something in my head that turned out to require some extra fiddling to implement. But often I’m just playing around until I find something that grabs me in some way, so it’s very much an exploration.”

“Renewable Resources” and “Inner Light”

Shonkwiler actually has a lot of pieces that are related to each other mathematically. Regarding the two above, “They’re both visualizations of the same Möbius transformation. A Möbius transformation of a sphere is just a map from the sphere to itself that preserves the angles everywhere. They’re important in complex analysis, algebraic geometry, hyperbolic geometry and various other places, which means there are lots of interesting ways to think about them. They come up in my research in the guise of automorphisms of the projective line and as isometries of the hyperbolic plane, so they’re often on my mind.”

“To make ‘Inner Light,’ I took a bunch of concentric circles in the plane and just started scaling the plane by more and more, so that each individual circle is getting bigger and bigger. Then I inverse-stereographically project up to the sphere, where the circles become circles of latitude and I make a tube around each one. ‘Renewable Resources’ is basically the same thing, except I just have individual points on each circle and I’m only showing half of the sphere in the final image rather than the whole sphere.”

When I asked Shonkwiler about his philosophy on the relationship between math and aesthetics, he said, “Part of becoming a mathematician is developing a very particular kind of aesthetic sense that tells you whether an argument or a theory is beautiful or ugly, but this has probably been overemphasized to the point of cliché.”

However, Shonkwiler continued to mull the question. “I do think that when you make a visualization of a piece of interesting mathematics, it is often the case that it is visually compelling on some deep level, even if not exactly beautiful in a traditional sense. That might just be confirmation bias on my part, so there’s definitely an empirical question as to whether that’s really true and, if it is, you could probably have a metaphysical or epistemological debate about why that might be. But in any case, I think it’s an interesting challenge to find those visually compelling fragments of mathematics and then to try to present them in a way that also incorporates some more traditional aesthetic considerations. That’s something I feel like I’ve gotten marginally better at over the years, but I’m definitely still learning.”

Here is Shonkwiler with one of his GIFs at MediaLive x Ello: International GIF Competition in Boulder, Colorado:

Clay at GIF competition

Check out Clayton Shonkwiler’s Wolfram Community contributions. To explore his work further, visit his blog and his website. Of course, if you have Wolfram Language–based art, post it on Wolfram Community to strike up a conversation with other art and Wolfram Language enthusiasts. It’s easy and free to sign up for a Community account.


2. Tracking a Descent to Savagery with the Wolfram Language: Plotting Sentiment Analysis in Lord of the Flies (Thu., Dec. 7)

Computation is no longer the preserve of science and engineering, so I thought I would share a simple computational literary analysis that I did with my daughter.

Shell Lord of the Flies

Hannah’s favorite book is Lord of the Flies by William Golding, and as part of a project she was doing, she wanted to find some quantitative information to support her critique.

Spoiler alert: for those who don’t know it, the book tells the story of a group of schoolboys shipwrecked on an island. Written as a reaction to The Coral Island, an optimistic and uplifting book with a similar initial premise, Lord of the Flies instead relates the boys’ descent into savagery once they are separated from societal influence.

The principal data that Hannah asked for was a timeline of the appearance of the characters. This is a pretty straightforward bit of counting. For a given character name, I can search the text for the positions it appears in, and while I am at it, label the data with Legended so that it looks nicer when plotted.

nameData[name_, label_] :=
  Legended[
   First /@ StringPosition[$lotf, name, IgnoreCase -> True]/
    StringLength[$lotf], label];
nameData[name_] := nameData[name, name];
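
As a quick check of what nameData computes, here is a minimal sketch on a made-up string. The name toyText and its contents are illustrative assumptions, not text from the book; the point is only that each match position is divided by the string length, so every value falls between 0 and 1.

```wolfram
(* toyText is a hypothetical stand-in for $lotf *)
toyText = "Ralph ran. Jack ran. Ralph won.";

(* start position of each case-insensitive match, rescaled by the text length *)
toyPositions =
 First /@ StringPosition[toyText, "ralph", IgnoreCase -> True]/
  StringLength[toyText]
(* {1/31, 22/31} *)
```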

The variable $lotf contains the text of the book (there is some discussion later about how to get that). By dividing the string position by the length of the book, I am rescaling the range to 0–1 to make alignment with later work easier. Now I simply create a Histogram of the data. I used a SmoothHistogram, as it looks nicer. The smoothing parameter of 0.06 is rather subjective, but gave this rather pleasingly smooth overview without squashing all the details.

characters = SmoothHistogram[{
    nameData["Ralph"],
    nameData["Jack"],
    nameData["Beast", "The Beast"]}, 0.06,
   PlotStyle -> Thickness[0.01],
   PlotRange -> {{0, 1}, All}, PlotLabel -> "Character appearances",
   Ticks -> None]

Already we can see some of the narrative arc of the book. The protagonist, Ralph, makes an early appearance, closely followed by the antagonist, Jack. The nonexistent Beast appears as a minor character early in the book as the boys explore the island before becoming a major feature in the middle of the book, preceding Jack’s rise as Jack exploits fear of the Beast to take power. Ralph becomes significant again toward the end as the conflict between him and Jack reaches its peak.

But most of Hannah’s critique was about meaning, not plot, so we started talking about the tone of the book. To quantify this, we can use a simple machine learning classifier on the sentences and then do basic statistics on the result.

By breaking the text into sentences and then using the built-in sentiment analyzer, we can hunt out the sentence most likely to be a positive one.

MaximalBy[TextSentences[$lotf],
 Classify["Sentiment", #, {"Probability", "Positive"}] &]

The classifier returns only "Positive", "Negative" and "Neutral" classes, so if we map those to numbers, we can take a moving average to produce an average sentiment vector. MeanFilter is given a range of 500, so each sentence is averaged with up to 500 neighbors on either side.

sentimentData =
  Legended[
   MeanFilter[
     ReplaceAll[
      Classify["Sentiment", TextSentences[$lotf]],
      {"Positive" -> 1, "Negative" -> -1,
       "Neutral" | Indeterminate -> 0}], 500]*20,
   "Sentiment"];
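
The class-to-number mapping and the smoothing can be tried on a handful of labels without running the classifier. This is only a sketch: toyLabels is a hypothetical classifier output, and the range of 1 is chosen so the averaging can be checked by hand.

```wolfram
(* hypothetical classifier output for seven sentences *)
toyLabels = {"Positive", "Positive", "Neutral", "Negative", "Negative",
   Indeterminate, "Positive"};

(* map classes to numbers with the same rules used for sentimentData *)
toyScores = toyLabels /. {"Positive" -> 1, "Negative" -> -1,
    "Neutral" | Indeterminate -> 0}
(* {1, 1, 0, -1, -1, 0, 1} *)

(* range 1: each value is averaged with one neighbor on either side *)
toySmoothed = MeanFilter[toyScores, 1]
```

For instance, the fourth entry of toySmoothed is the mean of 0, -1 and -1, that is, -2/3.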

Putting that together with the character occurrences allows some interesting insights.

We can see that there is an early negative tone as the boys are shipwrecked, which quickly becomes positive as they explore the island and their newfound freedom. The tone becomes more neutral as concerns rise about the Beast, and turns negative as Jack rises to power. There is a brief period of positivity as Ralph returns to prominence before the book dives into bleak territory as everything goes bad (especially for Piggy).

Digital humanities is a growing field, and I think the relative ease with which the Wolfram Language can be applied to text analysis and other kinds of data science should help computation make useful contributions to many fields that were once considered entirely subjective.

Appendix: Notes on Data Preparation

Because I wanted to avoid needing to OCR my hard copy of the book, I used a digital copy. However, the data was corrupted by some page headers, artifacts of navigation hyperlinks and an index page. So, here is some rather dirty string pattern work to strip out those extraneous words and numbers to produce a clean string containing only narrative:

$lotf = StringReplace[
   StringDelete[
    StringReplace[
     StringDrop[
      StringDelete[
       Import["http://<redacted>.txt"],
       {"Page " ~~ Shortest[___] ~~
         " of 290\n\nGo Back\n\nFull Screen\n\nClose\n\nQuit",
        "Home Page\n\nTitle Page\n\nContents\n\n!!\n\n\"\"\n\n!\n\n\""}],
      425],
     "\n" -> " "],
    {"Home Page  " ~~ Shortest[___] ~~ "  Title Page  Contents",
     "!!  \"\"  !  \"    ", "\f"}],
   "  " -> " "];

Appendix 2: Text Labels

This is the code for labeling the plot with key moments in the book:

sentenceLabel[{sentence_String, label_}] :=
  Block[{pos =
     Once[First[FirstPosition[TextSentences[$lotf], sentence]]]},
   Callout[{pos/Length[TextSentences[$lotf]], sentimentData[[1, pos]]},
    label, Appearance -> "Balloon"]]

eventLabels = ListPlot[sentenceLabel /@
    {
     {"As they watched, a flash of fire appeared at the root of one wisp, and then the smoke thickened.", "The fire"},
     {"Only the beast lay still, a few yards from the sea.", "Simon dies"},
     {"Samneric were savages like the rest; Piggy was dead, and the conch smashed to powder.", "Piggy dies"},
     {"\[OpenCurlyDoubleQuote]I\[CloseCurlyQuote]m chief then.\[CloseCurlyDoubleQuote]", "Ralph leads"},
     {"We aren\[CloseCurlyQuote]t enough to keep the fire burning.\[CloseCurlyDoubleQuote]", "Ralph usurped"}}];

Appendix 3: Conch Shell Word Cloud Image

It doesn’t provide much insight, but the conch word cloud at the top of the article was generated with this code:

img = Import["http://pngimg.com/uploads/conch/conch_PNG18242.png"];
ImageMultiply[
 Rasterize[
  WordCloud[DeleteStopwords[$lotf], AlphaChannel[img], MaxItems -> 50,
   WordSelectionFunction -> (StringLength[#] >= 4 &)],
  ImageSize -> ImageDimensions[img]], ImageAdjust[img, {-0.7, 1.3}]]

Reference—conch shell: Source image from pngimg.com Creative Commons 4.0 BY-NC.


3. Finding X in Espresso: Adventures in Computational Lexicology (Thu., Nov. 30)

When Does a Word Become a Word?

“A shot of expresso, please.” “You mean ‘espresso,’ don’t you?” A baffled customer, a smug barista—media is abuzz with one version or another of this story. But the real question is not whether “expresso” is a correct spelling, but rather how spellings evolve and enter dictionaries. Lexicographers do not directly decide that; the data does. Long and frequent usage may qualify a word for endorsement. Moreover, I believe the emergent proliferation of computational approaches can help to form an even deeper insight into the language. The tale of expresso is a thriller from a computational perspective.

In the past I had taken the incorrectness of expresso for granted. And how could I not, with the thriving pop-culture of “no X in espresso” posters, t-shirts and even proclamations from music stars such as “Weird Al” Yankovic. Until a statement in a recent note by Merriam-Webster’s online dictionary caught my eye: “… expresso shows enough use in English to be entered in the dictionary and is not disqualified by the lack of an x in its Italian etymon.” Can this assertion be quantified? I hope this computational treatise will convince you that it can. But to set the backdrop right, let’s first look into the history.

Expresso in video segment No X in espresso poster

History of Industry and Language

In the 19th century’s steam age, many engineers tackled steam applications accelerating the coffee-brewing process to increase customer turnover, as coffee was a booming business in Europe. The original espresso machine is usually attributed to Angelo Moriondo from Turin, who obtained a patent in 1884 for “new steam machinery for the economic and instantaneous confection of coffee beverage.” But despite further engineering improvements (see the Smithsonian), for decades espresso remained only a local Italian delight. And for words to jump between languages, industries need to jump the borders—this is how industrial evolution triggers language evolution. The first Italian to truly venture the espresso business internationally was Achille Gaggia, a coffee bartender from Milan.

Expresso timeline

In 1938 Gaggia patented a new method using the celebrated lever-driven piston mechanism allowing new record-brewing pressures, quick espresso shots and, as a side effect, even crema foam, a future signature of an excellent espresso. This allowed the Gaggia company (founded in 1948) to commercialize the espresso machines as a consumer product for use in bars. There was about a decade span between the original 1938 patent and its 1949 industrial implementation.

Original espresso maker

Around 1950, espresso machines began crossing Italian borders to the United Kingdom, America and Africa. This is when the first large spike happens in the use of the word espresso in the English language. The spike and following rapid growth are evident from the historic WordFrequencyData of published English corpora plotted across the 20th century:

history[w_] :=   WordFrequencyData[w, "TimeSeries", {1900, 2000}, IgnoreCase -> True]

The function above gets TimeSeries data for the frequencies of words w in a fixed time range from 1900–2000 that, of course, can be extended if needed. The data can be promptly visualized with DateListPlot:

DateListPlot[history[{"espresso", "expresso"}], PlotRange -> All,   PlotTheme -> "Wide"]

The much less frequent expresso also gains its popularity slowly but steadily. Its simultaneous growth is more obvious with the log-scaled vertical frequency axis. To be able to easily switch between log and regular scales and also improve the visual comprehension of multiple plots, I will define a function:

vkWordFreqPlot[list_, plot_] :=
  plot[MovingAverage[#, 3] & /@
    WordFrequencyData[list, "TimeSeries", {1900, 2000},
     IgnoreCase -> True], PlotTheme -> "Detailed", AspectRatio -> 1/3,
   Filling -> Bottom, PlotRange -> All, InterpolationOrder -> 2,
   PlotLegends -> Placed[Automatic, {Left, Top}]];

The plot below also compares the espresso/expresso pair to a typical pair acknowledged by dictionaries, unfocused/unfocussed, stemming from American/British usage:

vkWordFreqPlot[{"espresso", "expresso", "unfocused", "unfocussed"},
 DateListLogPlot]

The overall temporal behavior of frequencies for these two pairs is quite similar, as it is for many other words of alternative orthography acknowledged by dictionaries. So why is espresso/expresso so controversial? A good historical account is given by Slate Magazine, which, as does Merriam-Webster, supports the official endorsement of expresso. And while both articles give a clear etymological reasoning, the important argument for expresso is its persistent frequent usage (even in such distinguished publications as The New York Times). As it stands as of the date of this blog, the following lexicographic vote has been cast in support of expresso by some selected trusted sources I scanned through. Aye: Merriam-Webster online, Harper Collins online, Random House online. Nay: Cambridge Dictionary online, Oxford Learner’s Dictionaries online, Oxford Dictionaries online (“The spelling expresso is not used in the original Italian and is strictly incorrect, although it is common”; see also the relevant blog), Garner’s Modern American Usage, 3rd edition (“Writers frequently use the erroneous form [expresso]”).

In times of dividing lines, data helps us to refocus on the whole picture and dominant patterns. To stress diversity of alternative spellings, consider the pair amok/amuck:

vkWordFreqPlot[{"amok", "amuck"}, DateListPlot]

Of a rather macabre origin, amok came to English around the mid-1600s from the Malay amuk, meaning “murderous frenzy,” referring to a psychiatric disorder of a manic urge to murder. The pair amok/amuck has interesting characteristics. Both spellings can be found in dictionaries. The WordFrequencyData above shows the rich dynamics of oscillating popularity, followed by the competitive rival amuck becoming the underdog. The difference in orthography does not have a typical British/American origin, which should affect how alternative spellings are sampled for statistical analysis further below. And finally, the Levenshtein EditDistance is not equal to 1…

EditDistance["amok", "amuck"]

… in contrast to many typical cases such as:

EditDistance @@@ {{"color", "colour"}, {"realize",     "realise"}, {"aesthetic", "esthetic"}}
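
The outputs of these two calls can be worked out by hand, which makes the contrast concrete. The sketch below repeats the calls with the values they should return; the variable names are introduced here only for illustration.

```wolfram
(* amok -> amuk (substitute o with u) -> amuck (insert c): two edits *)
dAmok = EditDistance["amok", "amuck"]
(* 2 *)

(* each of these pairs differs by one inserted, deleted or substituted letter *)
dTypical = EditDistance @@@ {{"color", "colour"}, {"realize", "realise"},
   {"aesthetic", "esthetic"}}
(* {1, 1, 1} *)
```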

This will also affect the sampling of data. My goal is to extract from a dictionary a data sample large enough to describe the diversity of alternatively spelled words that are also structurally close to the espresso/expresso pair. If the basic statistics of this sample assimilate the espresso/expresso pair well, then it quantifies and confirms Merriam-Webster’s assertion that “expresso shows enough use in English to be entered in the dictionary.” But it also goes a step further, because now all pairs from the dictionary sample can be considered as precedents for legitimizing expresso.

Dictionary as Data

Alternative spellings come in pairs and should not be considered separately because there is statistical information in their relation to each other. For instance, the word frequency of expresso should not be compared with the frequency of an arbitrary word in a dictionary. Instead, we should consider an alternative spelling pair as a single data point with coordinates {f+, f−} denoting the higher and lower word frequencies of the more and less popular spellings, respectively, and always in that order. I will use the weighted average of a word frequency over all years and all data corpora. It is a better overall metric than a word frequency at a specific date, and avoids the confusion of a frequency changing its state between higher f+ and lower f− at different moments in time (as we saw for amok/amuck). Weighted average is the default value of WordFrequencyData when no date is specified as an argument.

The starting point is a dictionary that is represented in the Wolfram Language by WordList and contains 84,923 definitions:

Length[words = WordList["KnownWords"]]

There are many types of dictionaries with quite varied sizes. There is no dictionary in the world that contains all words. And, in fact, all dictionaries are outdated as soon as they are published due to continuous language evolution. My assumption is that the exact size or date of a dictionary is unimportant as long as it is “modern and large enough” to produce a quality sample of spelling variants. The curated built-in data of the Wolfram Language, such as WordList, does a great job at this.

We notice right away that language is often prone to quite simple laws and patterns. For instance, it is widely assumed that lengths of words in an English dictionary…

Histogram[StringLength[words], Automatic, "PDF",   PlotTheme -> "Detailed", PlotRange -> All]

… follow quite well one of the simplest statistical distributions, the PoissonDistribution. The Wolfram Language machine learning function FindDistribution picks up on that easily:

FindDistribution[StringLength[words]]


Show[%%, DiscretePlot[PDF[%, k], {k, 0, 33}, Joined -> True]]

My goal is to search for such patterns and laws in the sample of alternative spellings. But first they need to be extracted from the dictionary.

Extracting Spelling Variants

For ease of data processing and analysis, I will make a set of simplifications. First of all, only the following basic parts of speech are considered to bring data closer to the espresso/expresso case:

royalTypes = {"Noun", "Adjective", "Verb", "Adverb"};

This reduces the dictionary to 84,487 words:

royals = DeleteDuplicates[
   Flatten[WordList[{"KnownWords", #}] & /@ royalTypes]];
Length[royals]

Deletion of duplicates is necessary, because the same word can be used as several parts of speech. Further, the words containing any characters beyond the lowercase English alphabet are excluded:

outlaws = Complement[Union[Flatten[Characters[words]]], Alphabet[]]

This also removes all proper names, and drops the number of words to 63,712:

laws = Select[royals, ! StringContainsQ[#, outlaws] &];
Length[laws]
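
The same character filter can be tried on a tiny word list. This is a sketch under the assumption of a made-up list toyWords; it mirrors the outlaws/laws steps above but with names of its own so that it does not overwrite them.

```wolfram
(* hypothetical mini-dictionary with a capital, a hyphen and an accent *)
toyWords = {"espresso", "Wolfram", "co-op", "café", "amok"};

(* every character that is not a lowercase English letter *)
toyOutlaws = Complement[Union[Flatten[Characters[toyWords]]], Alphabet[]];

(* keep only the words free of those characters *)
toyLaws = Select[toyWords, ! StringContainsQ[#, toyOutlaws] &]
(* {"espresso", "amok"} *)
```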

Every word is paired with the list of its definitions, and every list of definitions is sorted alphabetically to ensure exact matches in determining alternative spellings:

Define[w_] := w -> Sort[WordDefinition[w]];
defs = Define /@ laws;

Next, words are grouped by their definitions; single-word groups are removed, and definitions themselves are removed too. The resulting dataset contains 8,138 groups:

samedefs =   Replace[GatherBy[defs, Last], {_ -> _} :> Nothing, 1][[All, All, 1]]
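
To illustrate the grouping step in isolation, the sketch below applies the same expression to three hand-written word -> definitions rules. The list toyDefs is hypothetical and only stands in for defs.

```wolfram
(* hypothetical word -> definitions rules *)
toyDefs = {"color" -> {"a visual attribute"},
   "colour" -> {"a visual attribute"},
   "cat" -> {"a feline mammal"}};

(* group by definition, drop single-word groups, keep only the words *)
toyGroups =
 Replace[GatherBy[toyDefs, Last], {_ -> _} :> Nothing, 1][[All, All, 1]]
(* {{"color", "colour"}} *)
```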


Different groups of words with the same definition have a variable number of words n ≥ 2…

Framed[TableForm[Transpose[groups = Sort[Tally[Length /@ samedefs]]],    TableHeadings -> {groupsHead = {"words, n", "groups, m"}, None},    TableSpacing -> {1, 2}]]

… where m is the number of groups. They follow a remarkable power law. Very roughly, to order of magnitude, m ~ 200000 n^(-5).

Show[ListLogLogPlot[groups, PlotTheme -> "Business",
  FrameLabel -> groupsHead],
 Plot[Evaluate[Fit[Log[groups], {1, x}, x]], {x, Log[2], Log[14]},
  PlotStyle -> Red]]
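
The straight-line fit in log-log space is what turns into a power law. As a sketch, the code below builds synthetic data that follows m = 200000 n^(-5) exactly (an assumption for illustration, not the real tally) and shows that the same Fit call recovers the exponent and the prefactor.

```wolfram
(* synthetic {n, m} pairs obeying m = 200000 n^-5 exactly; not the real data *)
toyTally = Table[{n, 200000. n^-5}, {n, 2, 14}];

(* linear fit of Log[m] against Log[n], as in the plot above *)
toyFit = Fit[Log[toyTally], {1, x}, x];
{intercept, slope} = CoefficientList[toyFit, x];

(* slope is the exponent; Exp[intercept] is the prefactor *)
{slope, Exp[intercept]}
```

Up to rounding, this returns an exponent of -5 and a prefactor of 200000.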

Close synonyms are often grouped together:

Select[samedefs, Length[#] == 10 &]

This happens because WordDefinition is usually quite concise:

WordDefinition /@ {"abjure", "forswear", "recant"}

To separate synonyms from alternative spellings, I could use heuristics based on orthographic rules formulated for classes such as British versus American English. But that would be too complex and unnecessary. It is much easier to consider only word pairs that differ by a small Levenshtein EditDistance. It is highly improbable for synonyms to differ by just a few letters, especially a single one. So while this excludes not only synonyms but also alternative spellings such as amok/amuck, it does help to select words closer to espresso/expresso and hopefully make the data sample more uniform. The computations can be easily generalized to a larger Levenshtein EditDistance, but it would be important and interesting to first check the most basic case:

EditOne[l_] :=
  l[[#]] & /@ Union[Sort /@ Position[Outer[EditDistance, l, l], 1]];
samedefspair = Flatten[EditOne /@ samedefs, 1]
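
Here is a small sketch of what EditOne does to a single group. The definition is repeated so the block runs on its own, and the group toyGroup is a made-up example: only the two words at edit distance 1 survive as a pair.

```wolfram
(* same definition as above, repeated so this block is self-contained *)
EditOne[l_] :=
  l[[#]] & /@ Union[Sort /@ Position[Outer[EditDistance, l, l], 1]];

(* hypothetical group of words sharing one definition *)
toyGroup = {"color", "colour", "hue"};

(* index pairs at distance 1 are deduplicated, then mapped back to words *)
toyPairs = EditOne[toyGroup]
(* {{"color", "colour"}} *)
```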

This reduces the sample size to 2,882 pairs:


Mutations of Spellings

Alternative spellings are different orthographic states of the same word that have different probabilities of occurrence in the corpora. They can inter-mutate based on the context or environment they are embedded into. Analysis of such mutations seems intriguing. The mutations can be extracted with the help of the SequenceAlignment function. It is based on algorithms from bioinformatics that identify regions of similarity in DNA, RNA or protein sequences, and that often find their way into other fields such as linguistics, natural language processing and even business and marketing research. The mutations can be between two characters or a character and a “hole” due to character removal or insertion:

SequenceAlignment @@@ {{"color", "colour"}, {"mesmerise",     "mesmerize"}}

In the extracted mutations’ data, the “hole” is replaced by a dash (-) for visual distinction:

mutation =   Cases[SequenceAlignment @@@ samedefspair, _List, {2}] /. "" -> "-"
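
To make the extraction step concrete, the sketch below runs it on the two example pairs instead of samedefspair. The names toyPairs2, toyAligned and toyMutation are introduced only for this illustration.

```wolfram
toyPairs2 = {{"color", "colour"}, {"mesmerise", "mesmerize"}};

(* shared runs stay as strings; each difference becomes a two-element list *)
toyAligned = SequenceAlignment @@@ toyPairs2
(* {{"colo", {"", "u"}, "r"}, {"mesmeri", {"s", "z"}, "e"}} *)

(* keep only the differences and mark the empty "hole" with a dash *)
toyMutation = Cases[toyAligned, _List, {2}] /. "" -> "-"
(* {{"-", "u"}, {"s", "z"}} *)
```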

The most probable letters to participate in a mutation between alternative spellings can be visualized with Tally. The most popular letters are s and z thanks to the British/American endings -ise/-ize, surpassed only by the popularity of the “hole.” This probably stems from the fact that dropping letters often makes orthography and phonetics easier.

vertex = Association[Rule @@@ SortBy[Tally[Flatten[mutation]], Last]];
optChart = {ColorFunction -> "Rainbow", BaseStyle -> 15,
   PlotTheme -> "Web"};
inChar = PieChart[vertex, optChart, ChartLabels -> Callout[Automatic],
   SectorOrigin -> -Pi/9];
BarChart[Reverse[vertex], optChart, ChartLabels -> Automatic,
 Epilog -> Inset[inChar, Scaled[{.6, .5}], Automatic, Scaled[1.1]]]

Querying Word Frequencies

The next step is to get the WordFrequencyData for all 2 × 2882 = 5764 words of alternative spelling stored in the variable samedefspair. WordFrequencyData is a very large dataset, and it is stored on Wolfram servers. To query frequencies for a few thousand words efficiently, I wrote some special code that can be found in the notebook attached at the end of this blog. The resulting data is an Association containing alternative spellings with ordered pairs of words as keys and ordered pairs of frequencies as values. The higher-frequency entry is always first:


The size of the data is slightly less than the original queried set because for some words, frequencies are unknown:

{Length[data], Length[samedefspair] - Length[data]}

Basic Analysis

Having obtained the data, I am now ready to check how well the frequencies of espresso/expresso fall within this data:

esex = Values[   WordFrequencyData[{"espresso", "expresso"}, IgnoreCase -> True]]

As a start, I will examine if there are any correlations between lower and higher frequencies. Pearson’s Correlation coefficient, a measure of the strength of the linear relationship between two variables, gives a high value for lower versus higher frequencies:

Correlation @@ Transpose[Values[data]]

But plotting frequency values at their natural scale hints that a log scale could be more appropriate:

ListPlot[Values[data], AspectRatio -> Automatic,   PlotTheme -> "Business", PlotRange -> All]

And indeed for log-values of frequencies, the Correlation strength is significantly higher:

Correlation @@ Transpose[Log[Values[data]]]

Fitting the log-log of data reveals a nice linear fit…

lmf = LinearModelFit[Log[Values[data]], x, x]; lmf["BestFit"]

… with sensible statistics of parameters:


In the frequency space, this shows a simple and quite remarkable power law that sheds light on the nature of correlations between the frequencies of less and more popular spellings of the same word:

Reduce[Log[SubMinus[f]] == lmf["BestFit"] /.    x -> Log[SubPlus[f]], SubMinus[f], Reals]
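To see why a straight line in log-log space corresponds to a power law in frequency space, here is a minimal sketch. The symbols c0 and c1 are hypothetical stand-ins for the fitted intercept and slope (the actual values could be read from lmf["BestFitParameters"]), and fp stands for the higher frequency:

```wolfram
ClearAll[c0, c1, fp];
(* fitted line in log-log space: Log[fm] == c0 + c1 Log[fp] *)
logLaw = c0 + c1 Log[fp];
(* exponentiating both sides gives the relation in frequency space *)
powerLaw = PowerExpand[Exp[logLaw]]
(* sanity check with made-up numbers: the two forms should agree numerically *)
{Exp[logLaw], powerLaw} /. {c0 -> -1., c1 -> 1.1, fp -> 10.^-6}
```

The intercept of the line becomes the prefactor of the power law, and the slope becomes its exponent.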

Log-log space gives a clear visualization of the data. Because the coordinates {f+, f−} are sorted as {greater, smaller}, no data point can lie above the limiting orange line Log[f−]==Log[f+]. The purple line is the linear fit of the power law. The red circle is the median of the data, and the red dot is the value of the espresso/expresso frequency pair:
ListLogLogPlot[data, PlotRange -> All, AspectRatio -> Automatic,   PlotTheme -> "Detailed",  		ImageSize -> 800, Epilog -> {{Purple, Thickness[.004], Opacity[.4],     	Line[Transpose[{{-30, 0}, Normal[lmf] /. x -> {-30, 0}}]]},    		{Orange, Thickness[.004], Opacity[.4],      Line[{-30 {1, 1}, -10 {1, 1}}]},    		{Red, Opacity[.5], PointSize[.02], Point[Log[esex]]},    		{Red, Opacity[.5], Thickness[.01],      Circle[Median[Log[Values[data]]], .2]}}]

A simple, useful transformation of the coordinate system will help our understanding of the data. Away from log-frequency vs. log-frequency space we go. The distance from a data point to the orange line Log[f−]==Log[f+] is the measure of how many times larger the higher frequency is than the lower. It is given by a linear transformation: a rotation of the coordinate system by 45 degrees. Because this distance is given by a difference of logs, it relates to the ratio of frequencies:
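The rotation can be written out explicitly. This is a minimal sketch, not the original notebook code: lp and lm are hypothetical symbols standing for the logs of the higher and lower frequency, and fp and fm for the frequencies themselves.

```wolfram
ClearAll[lp, lm, fp, fm];
(* rotate the log-frequency coordinates by 45 degrees *)
rotated = RotationMatrix[Pi/4] . {lp, lm};
(* compare with {distance from the diagonal, position along the diagonal} *)
Simplify[rotated - {(lp - lm)/Sqrt[2], (lp + lm)/Sqrt[2]}]   (* {0, 0} *)
(* undoing the 1/Sqrt[2] scaling of the first coordinate recovers the frequency ratio *)
Simplify[Exp[Sqrt[2] First[rotated]] /. {lp -> Log[fp], lm -> Log[fm]},
 fp > 0 && fm > 0]
```

The first rotated coordinate is the difference of logs divided by Sqrt[2], which is why exponentiating Sqrt[2] times it gives back the ratio of the two frequencies.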


This random variable is well fit by the very famous and versatile WeibullDistribution, which is used universally for weather forecasting to describe wind speed distributions; survival analysis; reliability, industrial and electrical engineering; extreme value theory; forecasting technological change; and much more—including, now, word frequencies:

dist = FindDistribution[   trans = (#1 - #2)/Sqrt[2] & @@@ Log[Values[data]]]

One of the most fascinating observations about mathematics is captured in “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,” the title of a 1960 paper by the physicist Eugene Wigner. One of its notions is that mathematical concepts often apply uncannily and universally far beyond the context in which they were originally conceived. We may have caught a glimpse of that in our data.

Using statistical tools, we can figure out that in the original space the frequency ratio obeys a distribution with a nice analytic formula:

Assuming[SubPlus[f]/SubMinus[f] > 1,   PDF[TransformedDistribution[E^(Sqrt[2] u),     u \[Distributed] WeibullDistribution[a, b]], SubPlus[f]/SubMinus[   f]]]

It remains to note that the other corresponding transformed coordinate relates to the frequency product…

TraditionalForm[PowerExpand[Log[(SubPlus[f] SubMinus[f])^2^(-1/2)]]]

… and is the position of a data point along the orange line Log[f−]==Log[f+]. It reflects how popular, on average, a specific word pair is among other pairs. One can see that the espresso/expresso value lands well above the median, meaning the frequency of its usage is higher than that of half of the data points.

Nearest can find the closest pairs to espresso/expresso measured by EuclideanDistance in the frequency space. Taking a look at the 50 nearest pairs shows just how typical the espresso/expresso frequencies (marked below by a red dot) are. Many nearest neighbors, such as energize/energise and zombie/zombi, belong to the basic everyday vocabulary of most frequent usage:

neighb = Nearest[data, esex, 50]; ListPlot[Association @@ Thread[neighb -> data /@ neighb],  	PlotRange -> All, AspectRatio -> Automatic, PlotTheme -> "Detailed",  	Epilog -> {{Red, Opacity[.5], PointSize[.03], Point[esex]}}]

The temporal behavior of frequencies for a few nearest neighbors shows significant diversity, and it is often reminiscent of the behavior of the espresso/expresso pair that was plotted at the beginning of this article:

Multicolumn[vkWordFreqPlot[#, DateListPlot] & /@ neighb[[;; 10]], 2]

Networks of Mutation

Frequencies allow us to define a direction of mutation, which can be visualized by a DirectedEdge always pointing from lower to higher frequency. A Tally of the edges defines weights (or unnormalized probabilities) of particular mutations.

muteWeigh =    Tally[Cases[SequenceAlignment @@@ Keys[data], _List, {2}] /.      "" -> "-"]; edge = Association[Rule @@@ Transpose[{       DirectedEdge @@ Reverse[#] & /@ muteWeigh[[All, 1]],        N[Rescale[muteWeigh[[All, 2]]]]}]];
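The extraction step above can be followed on a tiny example. This is a sketch with two hypothetical entries; the assumption that the first word of each pair is the more frequent one is made here only for illustration and is not taken from the data:

```wolfram
(* two hypothetical entries, each assumed to be ordered {more frequent, less frequent} *)
pairs = {{"organize", "organise"}, {"zombie", "zombi"}};
(* align each pair; stretches that differ show up as two-element sublists *)
aligned = SequenceAlignment @@@ pairs
(* keep only the differing stretches, marking an empty side (a deletion) with "-" *)
diffs = Cases[aligned, _List, {2}] /. "" -> "-"
(* edges point from the less frequent letter to the more frequent one *)
DirectedEdge @@ Reverse[#] & /@ diffs
```

Applying Tally to such a list of differing stretches over the whole dataset is what produces the mutation weights.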

For clarity of visualization, all edges with weights less than 1% of the maximum value are dropped. The most popular mutation is s→z, with maximum weight 1. It is interesting to note that reverse mutations occur too, but much less often; for instance, z→s has weight 0.0347938:

cutEdge = ReverseSort[ Select[edge, # > .01 &]]

PieChart[cutEdge, optChart, ChartLabels -> Callout[Automatic]]

Thus a letter can participate in several types of mutations, and in this sense mutations form a network. The size of a vertex is correlated with the probability that its letter participates in any mutation (see the variable vertex above):

vs = Thread[Keys[vertex] -> 2 N[.5 + Rescale[Values[vertex]]]];

The larger the edge weight, the brighter the edge:

es = Thread[    Keys[cutEdge] -> (Directive[Thickness[.003], Opacity[#]] & /@        N[Values[cutEdge]^.3])];

The letters r and g participate mostly in the deletion mutation. Letters with no edges participate in very rare mutations.

graphHighWeight =   Graph[Keys[vertex], Keys[cutEdge], PerformanceGoal -> "Quality",   VertexLabels -> Placed[Automatic, Center], VertexLabelStyle -> 15,    VertexSize -> vs, EdgeStyle -> es]

Among a few interesting substructures, one of the obvious is the high clustering of vowels. A Subgraph of vowels can be easily extracted…

vowels = {"a", "e", "i", "o", "u"}; Subgraph[graphHighWeight, vowels, GraphStyle -> "SmallNetwork"]

… and checked for completeness, which yields False due to many missing edges from and to u:


Nevertheless, as you might remember, the low-weight edges were dropped for a better visual of high-weight edges. Are there any interesting observations related to low-weight edges? As a matter of fact, yes, there are. Let’s quickly rebuild a full subgraph for only vowels. Vertex sizes are still based on the tally of letters in mutations:

vowelsVertex =   Association @@    Cases[Normal[vertex], Alternatives @@ (# -> _ & /@ vowels)]

vsVow = Thread[    Keys[vowelsVertex] -> .2 N[.5 + Rescale[Values[vowelsVertex]]]];

All mutations of vowels in the dictionary can be extracted with the help of MemberQ:

vowelsMute =    Select[muteWeigh, And @@ (MemberQ[vowels, #] & /@ First[#]) &]; vowelsEdge = Association[Rule @@@    Transpose[     MapAt[DirectedEdge @@ Reverse[#] & /@ # &, Transpose[vowelsMute],       1]]]

In order to visualize exactly the number of vowel mutations in the dictionary, the edge style is kept uniform and edge labels are used for nomenclature:

vowelGraph = Graph[Keys[vowelsVertex], Keys[vowelsEdge],   EdgeWeight -> vowelsMute[[All, 2]], PerformanceGoal -> "Quality",    VertexLabels -> Placed[Automatic, Center], VertexLabelStyle -> 20,    VertexSize -> vsVow, EdgeLabels -> "EdgeWeight",    EdgeLabelStyle -> Directive[15, Bold]]

And now when we consider all (even small-weight) mutations, the graph is complete:


But this completeness is quite “weak” in the sense that there are many edges with a really small weight, in particular two edges with weight 1:

Select[vowelsMute, Last[#] == 1 &]

This means that there is only one alternative word pair for e→u mutations, and likewise for i→o mutations. With the help of a lookup function…

lookupMute[l_] := With[{keys = Keys[data]}, keys[[Position[       SequenceAlignment @@@ keys /. "" -> "-",        Alternatives @@ l, {2}][[All, 1]]]]]

… these pairs can be found as:

lookupMute[{{"o", "i"}, {"u", "e"}}]

Thus, thanks to these unique and quite exotic words, our dictionaries have e→u and i→o mutations. Let’s check WordDefinition for these terms:

TableForm[WordDefinition /@ #,     TableHeadings -> {#, None}] &@{"corticofugal", "yarmulke"}

The word yarmulke is a quite curious case. First of all, it has three alternative spellings:

Nearest[WordData[], "yarmulke", {All, 1}]

Additionally, the Merriam-Webster Dictionary suggests a rich etymology: “Yiddish yarmlke, from Polish jarmułka & Ukrainian yarmulka skullcap, of Turkic origin; akin to Turkish yağmurluk rainwear.” The Turkic class of languages is quite wide:

EntityList[EntityClass["Language", "Turkic"]]

Together with the other mentioned languages, Turkic languages mark a large geographic area as the potential origin and evolution of the word yarmulke:

locs = DeleteDuplicates[Flatten[EntityValue[     {EntityClass["Language", "Turkic"],       EntityClass["Language", "Yiddish"], Entity["Language", "Polish"],       Entity["Language", "Ukrainian"]},      EntityProperty["Language", "PrimaryOrigin"]]]]

GeoGraphics[GeoMarker[locs, "Scale" -> Scaled[.03]],   GeoRange -> "World", GeoBackground -> "Coastlines",   GeoProjection -> "WinkelTripel"]

This evolution has Yiddish as an important stage before entering English, while Yiddish itself has a complex cultural history. English usage of yarmulke spikes around 1940–1945, so World War II and the subsequent Cold War era appear especially important in the word’s migration, probably reflecting the migration of and changes in Jewish communities during these times.

vkWordFreqPlot[{"yarmulke", "yarmelke", "yarmulka"}, DateListLogPlot]

These complex processes brought many more Yiddish words to English (my personal favorites are golem and glitch), but only a single one resulted in the introduction of the mutation e→u in the whole English dictionary (at least within our dataset). So while there are really no s→x mutations currently in English (as in espresso/expresso), this is not a negative indicator because there are cases of mutations that are unique to a single or just a few words. And actually, there are many more such mutations with a small weight than with a large weight:

ListLogLogPlot[Sort[Tally[muteWeigh[[All, 2]]]],   PlotTheme -> "Detailed",  PlotRange -> All,   FrameLabel -> {"mutation weight", "number of weights"},   Epilog -> Text[Style["s" \[DirectedEdge] "z", 15], Log@{600, 1.2}],   Filling -> Bottom]

So while the s→z mutation happens in 777 words, it is the only mutation with that weight:

MaximalBy[muteWeigh, Last]

On the other hand, there are 61 unique mutations that happen only once in a single word, as can be seen from the plot above. So in this sense, the most heavily weighted s→z mutation is an outlier, and if expresso enters a dictionary, then the espresso/expresso pair will join the majority of unique mutations with weight 1. These are the mutation networks for the first four small weights:

vkWeight[n_] := Select[muteWeigh, Last[#] == n &][[All, 1]]
vkMutationNetwork[n_] := Graph[DirectedEdge @@ Reverse[#] & /@ vkWeight[n], VertexLabels -> Placed[Automatic, Center], VertexLabelStyle -> 15, VertexSize -> Scaled[.07], AspectRatio -> 1, PerformanceGoal -> "Quality", PlotLabel -> "Mutation Weight = " <> ToString[n]]
Grid[Partition[vkMutationNetwork /@ Range[4], 2], Spacings -> {1, 1}, Frame -> All]

As the edge weight gets larger, networks become simpler—degenerating completely for very large weights. Let’s examine a particular set of mutations with a small weight—for instance, weight 2:

DirectedEdge @@ Reverse[#] & /@ Select[muteWeigh, Last[#] == 2 &][[All, 1]]

This means there are only two unique alternative spellings (four words) for each mutation out of the whole dictionary:

Multicolumn[  Row /@ Replace[    SequenceAlignment @@@ (weight2 = lookupMute[vkWeight[2]]) /.      "" -> "-", {x_, y_} :> Superscript[x, Style[y, 13, Red]], {2}], 4]

Red marks a less popular letter, printed as a superscript of the more popular one. While the majority of these pairs are truly alternative spellings with a sometimes curiously dynamic history of usage…

vkWordFreqPlot[{"fjord", "fiord"}, DateListPlot]

… some occasional pairs, like distrust/mistrust, indicate blurred lines between alternative spellings and very close synonyms with close orthographic forms—here the prefixes mis- and dis-. Such rare situations can be considered as a source of noise in our data if someone does not want to accept them as true alternative spellings. My personal opinion is that the lines are blurred indeed, as the prefixes mis- and dis- themselves can be considered alternative spellings of the same semantic notion.

These small-weight mutations (white dots in the graph below) are distributed among the rest of the data (black dots) really well, which reflects on their typicality. This can be visualized by constructing a density distribution with SmoothDensityHistogram, which uses SmoothKernelDistribution behind the scenes:

SmoothDensityHistogram[Log[Values[data]],  Mesh -> 50, ColorFunction -> "DarkRainbow", MeshStyle -> Opacity[.2],  PlotPoints -> 200, PlotRange -> {{-23, -11}, {-24, -12}}, Epilog -> {    {Black, Opacity[.4], PointSize[.002], Point[Log[Values[data]]]},    {White, Opacity[.7], PointSize[.01],      Point[Log[weight2 /. Normal[data]]]},    {Red, Opacity[1], PointSize[.02], Point[Log[esex]]},    {Red, Opacity[1], Thickness[.01],      Circle[Median[Log[Values[data]]], .2]}}]

Some of these very exclusive, rare alternative spellings are used more frequently than espresso/expresso and some less frequently, as shown above for the example of weight 2; the same can be shown for other weights. Color and contour lines provide a visual guide to the density of data points.


The following factors support accepting expresso as a valid alternative spelling.

  • Espresso/expresso falls close to the median usage frequencies of 2,693 official alternative spellings with Levenshtein EditDistance equal to 1
  • The frequency of espresso/expresso usage as a whole pair is above the median, so it is more likely to be found in published corpora than half of the examined dataset
  • Many nearest neighbors of espresso/expresso in the frequency space belong to a basic vocabulary of the most frequent everyday usage
  • The history of espresso/expresso usage in English corpora shows simultaneous growth for both spellings, and by temporal pattern is reminiscent of many other official alternative spellings
  • The uniqueness of the s→x mutation in the espresso/expresso pair is typical, as numerous other rare and unique mutations are officially endorsed by dictionaries

So all in all, it is ultimately up to you how to interpret this analysis or spell the name of the delightful Italian drink. But if you are a wisenheimer type, you might consider being a tinge more open-minded. The origin of words, as with the origin of species, has its dark corners, and due to inevitable and unpredictable language evolution, one day your remote descendants might frown on the choice of s in espresso.

Download this post as a Computable Document Format (CDF) file. New to CDF? Get your copy for free with this one-time download. If you would like to change parameters to make your own data exploration, download the full notebook.


4. How to Win at Risk: Exact Probabilities (Mon., Nov. 20)

The classic board game Risk involves conquering the world by winning battles that are played out using dice. There are lots of places on the web where you can find out the odds of winning a battle given the number of armies that each player has. However, all the ones that I have seen do this by Monte Carlo simulation, and so are innately approximate. The Wolfram Language makes it so easy to work out the exact values that I couldn’t resist calculating them once and for all.

Risk battle odds flow chart

Here are the basic battle rules: the attacker can choose up to three dice (but must have at least one more army than dice), and the defender can choose up to two (but must have at least two armies to use two). To have the best chances of winning, you always use the most dice possible, so I will ignore the other cases. Both players throw simultaneously and then the highest die from each side is paired, and (if both threw at least two dice) the next highest are paired. In each pair, the higher die kills an army on the other side and, in the event of a tie, the attacker is the loser. This process is repeated until one side runs out of armies.

So my goal is to create a function pBattle[a,d] that returns the probability that the battle ends ultimately as a win for the attacker, given that the attacker started with a armies and the defender started with d armies.

I start by coding the basic game rules. The main case is when both sides have enough armies to fight with at least two dice. There are three possible outcomes for a single round of the battle. The attacker wins twice or loses twice, or both sides lose one army. The probability of winning the battle is therefore the sum of the probabilities of winning after the killed armies are removed multiplied by the probability of that outcome.

pBattle[a_, d_] /; (a >= 3 && d >= 2) := Once[    pWin2[a, d] pBattle[a, d - 2] +      pWin1Lose1[a, d] pBattle[a - 1, d - 1] +      pLose2[a, d] pBattle[a - 2, d]    ];

We also have to cover the case that either side has run low on armies and there is only one game piece at stake.

pBattle[a_, d_] /; (a > 1 && d >= 1) := Once[    pWin1[a, d] pBattle[a, d - 1] + pLose1[a, d] pBattle[a - 1, d]    ];

This sets up a recursive definition that defines all our battle probabilities in terms of the probabilities of subsequent stages of the battle. Once prevents us working those values out repeatedly. We just need to terminate this recursion with the end-of-battle rules. If the attacker has only one army, he has lost (since he must have more armies than dice), so our win probability is zero. If our opponent has run out of armies, then the attacker has won.

pBattle[1, _] = 0; pBattle[_, 0] = 1;
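The same recursive pattern can be seen in isolation in a deliberately simplified game. This is a sketch under a stated assumption, not the rules used in this post: every round is one attacker die against one defender die, and the name pDuel is hypothetical. It uses the older memoization idiom f[x_] := f[x] = … in place of Once, which likewise avoids recomputing values:

```wolfram
(* simplified game: each round is one attacker die against one defender die *)
pRound = 15/36;  (* the attacker's die is strictly higher in 15 of the 36 outcomes *)
(* end-of-battle rules terminate the recursion *)
pDuel[1, _] = 0;
pDuel[_, 0] = 1;
(* memoized recursion: each computed value is stored as a new definition *)
pDuel[a_, d_] := pDuel[a, d] =
   pRound pDuel[a, d - 1] + (1 - pRound) pDuel[a - 1, d];
pDuel[2, 1]   (* 5/12 *)
pDuel[3, 1]   (* 95/144 *)
```

With two armies against one, the attacker gets a single roll, so the result is just the single-round probability; with three armies, a second roll is available after a first loss.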

Now we have to work out the probabilities of our five individual attack outcomes: pWin2, pWin1Lose1, pLose2, pWin1 and pLose1.

When using two or three dice, we can describe the distribution as an OrderDistribution of a DiscreteUniformDistribution because we always want to pair the highest throws together.

diceDistribution[n : 3 | 2] :=    OrderDistribution[{DiscreteUniformDistribution[{1, 6}], n}, {n - 1,      n}];

For example, here is one outcome of that distribution; the second number will always be the largest, due to the OrderDistribution part.


The one-die case is just a uniform distribution; our player has to use the value whether it is good or not. However, for programming convenience, I am going to describe a distribution of two numbers, but we will never look at the first.

diceDistribution[1] := DiscreteUniformDistribution[{{1, 6}, {1, 6}}];

So now the probability of winning twice is the probability that each attacker die beats the defender die it is paired with. The defender must be using two dice, but the attacker could be using two or three.

pWin2[a_, d_] /; a >= 3 && d >= 2 := Once[    Probability[     a1 > d1 &&       a2 > d2, {{a1, a2} \[Distributed]        diceDistribution[Min[a - 1, 3]], {d1, d2} \[Distributed]        diceDistribution[2]}]    ];

The lose-twice probability has a similar definition.

pLose2[a_, d_] := Once[    Probability[     a1 <= d1 &&       a2 <= d2, {{a1, a2} \[Distributed]        diceDistribution[Min[a - 1, 3]], {d1, d2} \[Distributed]        diceDistribution[2]}]    ];

And the probability of a split, where each side loses one army, is what’s left.

pWin1Lose1[a_, d_] := Once[1 - pWin2[a, d] - pLose2[a, d]]
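As an independent cross-check of the three outcome probabilities, one can enumerate every possible roll directly instead of using OrderDistribution. This is a minimal sketch for the case of three attacker dice against two defender dice; the name defenderLosses is hypothetical:

```wolfram
(* all 6^5 = 7776 equally likely rolls: three attacker dice, then two defender dice *)
rolls = Tuples[Range[6], 5];
(* number of armies the defender loses in one round *)
defenderLosses[roll_] := Module[{att, def},
   att = Take[ReverseSort[roll[[;; 3]]], 2];   (* attacker's two highest dice *)
   def = ReverseSort[roll[[4 ;;]]];            (* defender's dice, highest first *)
   (* pair highest with highest; strict comparison means ties go to the defender *)
   Count[MapThread[Greater, {att, def}], True]
   ];
(* exact probabilities of the defender losing 0, 1 or 2 armies *)
Counts[defenderLosses /@ rolls]/Length[rolls]
```

If the rules are encoded correctly, the values should agree with pLose2, pWin1Lose1 and pWin2 for this case, and with the commonly quoted figures of 2275/7776, 2611/7776 and 2890/7776 respectively.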

The one-army battle could be because the attacker is low on armies or because the defender is. Either way, we look only at the last value of our distributions.

pWin1[a_, d_] /; a === 2 || d === 1 := Once[    Probability[     a2 > d2, {{a1, a2} \[Distributed]        diceDistribution[Min[a - 1, 3]], {d1, d2} \[Distributed]        diceDistribution[Min[d, 2]]}]    ];

And pLose1 is just the remaining case.

pLose1[a_, d_] := 1 - pWin1[a, d];

And we are done. All that is left is to use the function. Here is the exact (assuming fair dice, and no cheating!) probability of winning if the attacker starts with 18 armies and the defender has only six.

pBattle[18, 6]

We can approximate this to 100 decimal places.

N[%, 100]

We can quickly enumerate the probabilities for lots of different starting positions.

table = Text@   Grid[Prepend[     Table[Prepend[Table[pBattle[a, d], {d, 1, 4}],        StringForm["Attack with " <> ToString[a]]], {a, 2, 16}],     Prepend[      Table[StringForm["Defend with " <> ToString[n]], {n, 1, 4}],       ""]], Frame -> All, FrameStyle -> LightGray]

Here are the corresponding numeric values to only 20 decimal places.

N[table, 20]

You can download tables of more permutations here, with exact numbers, and here, approximated to 20 digits.

Of course, this level of accuracy is rather pointless. If you look at the 23 vs. 1 battle, the probability of losing is about half the probability that you will actually die during the first throw of the dice, and certainly far less than the chances of your opponent throwing the board in the air and refusing to play ever again.

Appendix: Code for Generating the Outcomes Graph

vf[{x_, y_}, name_, {w_, h_}] := {Black, Circle[{x, y}, w], Black,     Text[If[StringQ[name], Style[name, 12],       Style[Row[name, "\[ThinSpace]vs\[ThinSpace]"], 9]], {x, y}]}; edge[e_, th_] :=    Property[e, EdgeStyle -> {Arrowheads[th/15], Thickness[th/40]}]; Graph[Flatten[Table[If[a >= 3 && d >= 2,      {       edge[{a, d} -> {a, d - 2}, pWin2[a, d]],       edge[{a, d} -> {a - 1, d - 1}, pWin1Lose1[a, d]],       edge[{a, d} -> {a - 2, d}, pLose2[a, d]]              },      {       edge[{a, d} -> {a, d - 1}, pWin1[a, d]],       edge[{a, d} -> {a - 1, d}, pLose1[a, d]]              }], {a, 2, 6}, {d, 1, 4}]] /. {{a_, 0} -> "Win", {1, d_} ->      "Lose"}, ImageSize -> Full, VertexShapeFunction -> vf,   VertexSize -> 1]



5. What Is a Computational Essay? (Tue., Nov. 14)

A Powerful Way to Express Ideas

People are used to producing prose—and sometimes pictures—to express themselves. But in the modern age of computation, something new has become possible that I’d like to call the computational essay.

I’ve been working on building the technology to support computational essays for several decades, but it’s only very recently that I’ve realized just how central computational essays can be to both the way people learn, and the way they communicate facts and ideas. Professionals of the future will routinely deliver results and reports as computational essays. Educators will routinely explain concepts using computational essays. Students will routinely produce computational essays as homework for their classes.

Here’s a very simple example of a computational essay:

Simple computational essay example

There are basically three kinds of things here. First, ordinary text (here in English). Second, computer input. And third, computer output. And the crucial point is that these all work together to express what’s being communicated.

The ordinary text gives context and motivation. The computer input gives a precise specification of what’s being talked about. And then the computer output delivers facts and results, often in graphical form. It’s a powerful form of exposition that combines computational thinking on the part of the human author with computational knowledge and computational processing from the computer.

But what really makes this work is the Wolfram Language—and the succinct representation of high-level ideas that it provides, defining a unique bridge between human computational thinking and actual computation and knowledge delivered by a computer.

In a typical computational essay, each piece of Wolfram Language input will usually be quite short (often not more than a line or two). But the point is that such input can communicate a high-level computational thought, in a form that can readily be understood both by the computer and by a human reading the essay.

It’s essential to all this that the Wolfram Language has so much built-in knowledge—both about the world and about how to compute things in it. Because that’s what allows it to immediately talk not just about abstract computations, but also about real things that exist and happen in the world—and ultimately to provide a true computational communication language that bridges the capabilities of humans and computers.

An Example

Let’s use a computational essay to explain computational essays.

Let’s say we want to talk about the structure of a human language, like English. English is basically made up of words. Let’s get a list of the common ones.

Generate a list of common words in English:

WordList[]



How long is a typical word? Well, we can take the list of common words, and make a histogram that shows their distribution of lengths.

Make a histogram of word lengths:

Histogram[StringLength[WordList[]]]



Do the same for French:

Histogram[StringLength[WordList[Language -> "French"]]]


Notice that the word lengths tend to be longer in French. We could investigate whether this is why documents tend to be longer in French than in English, or how this relates to quantities like entropy for text. (Of course, because this is a computational essay, the reader can rerun the computations in it themselves, say by trying Russian instead of French.)

But as something different, let’s compare languages by comparing their translations for, say, the word “computer”.

Find the translations for “computer” in the 10 most common languages:

Take[WordTranslation["computer", All], 10]


Find the first translation in each case:

First /@ Take[WordTranslation["computer", All], 10]


Arrange common languages in “feature space” based on their translations for “computer”:

FeatureSpacePlot[First /@ Take[WordTranslation["computer", All], 40]]


From this plot, we can start to investigate all sorts of structural and historical relationships between languages. But from the point of view of a computational essay, what’s important here is that we’re sharing the exposition between ordinary text, computer input, and output.

The text is saying what the basic point is. Then the input is giving a precise definition of what we want. And the output is showing what’s true about it. But take a look at the input. Even just by looking at the names of the Wolfram Language functions in it, one can get a pretty good idea what it’s talking about. And while the function names are based on English, one can use “code captions” to understand it in another language, say Japanese:


FeatureSpacePlot[First /@ Take[WordTranslation["computer", All], 40]]

But let’s say one doesn’t know about FeatureSpacePlot. What is it? If it was just a word or phrase in English, we might be able to look in a dictionary, but there wouldn’t be a precise answer. But a function in the Wolfram Language is always precisely defined. And to know what it does we can start by just looking at its documentation. But much more than that, we can just run it ourselves to explicitly see what it does.

FeatureSpacePlot page

And that’s a crucial part of what’s great about computational essays. If you read an ordinary essay, and you don’t understand something, then in the end you really just have to ask the author to find out what they meant. In a computational essay, though, there’s Wolfram Language input that precisely and unambiguously specifies everything—and if you want to know what it means, you can just run it and explore any detail of it on your computer, automatically and without recourse to anything like a discussion with the author.


How does one actually create a computational essay? With the technology stack we have, it’s very easy—mainly thanks to the concept of notebooks that we introduced with the first version of Mathematica all the way back in 1988. A notebook is a structured document that mixes cells of text together with cells of Wolfram Language input and output, including graphics, images, sounds, and interactive content:

A typical notebook

In modern times one great (and very hard to achieve!) thing is that full Wolfram Notebooks run seamlessly across desktop, cloud and mobile. You can author a notebook in the native Wolfram Desktop application (Mac, Windows, Linux)—or on the web through any web browser, or on mobile through the Wolfram Cloud app. Then you can share or publish it through the Wolfram Cloud, and get access to it on the web or on mobile, or download it to desktop or, now, iOS devices.

Notebook environments

Sometimes you want the reader of a notebook just to look at it, perhaps opening and closing groups of cells. Sometimes you also want them to be able to operate the interactive elements. And sometimes you want them to be able to edit and run the code, or maybe modify the whole notebook. And the crucial point is that all these things are easy to do with the cloud-desktop-mobile system we’ve built.

A New Form of Student Work

Computational essays are great for students to read, but they’re also great for students to write. Most of the current modalities for student work are remarkably old. Write an essay. Give a math derivation. These have been around for millennia. Not that there’s anything wrong with them. But now there’s something new: write a computational essay. And it’s wonderfully educational.

A computational essay is in effect an intellectual story told through a collaboration between a human author and a computer. The computer acts like a kind of intellectual exoskeleton, letting you immediately marshal vast computational power and knowledge. But it’s also an enforcer of understanding. Because to guide the computer through the story you’re trying to tell, you have to understand it yourself.

When students write ordinary essays, they’re typically writing about content that in some sense “already exists” (“discuss this passage”; “explain this piece of history”; …). But in doing computation (at least with the Wolfram Language) it’s so easy to discover new things that computational essays will end up with an essentially inexhaustible supply of new content that’s never been seen before. Students will be exploring and discovering as well as understanding and explaining.

When you write a computational essay, the code in your computational essay has to produce results that fit with the story you’re telling. It’s not like you’re doing a mathematical derivation, and then some teacher tells you you’ve got the wrong answer. You can immediately see what your code does, and whether it fits with the story you’re telling. If it doesn’t, well then maybe your code is wrong—or maybe your story is wrong.

What should the actual procedure be for students producing computational essays? At this year’s Wolfram Summer School we did the experiment of asking all our students to write a computational essay about anything they knew about. We ended up with 72 interesting essays—exploring a very wide range of topics.

In a more typical educational setting, the “prompt” for a computational essay could be something like “What is the typical length of a word in English” or “Explore word lengths in English”.

There’s also another workflow I’ve tried. As the “classroom” component of a class, do livecoding (or a live experiment). Create or discover something, with each student following along by doing their own computations. At the end of the class, each student will have a notebook they made. Then have their “homework” be to turn that notebook into a computational essay that explains what was done.

And in my experience, this ends up being a very good exercise—that really tests and cements the understanding students have. But there’s also something else: when students have created a computational essay, they have something they can keep—and directly use—forever.

And this is one of the great general features of computational essays. When students write them, they’re in effect creating a custom library of computational tools for themselves—that they’ll be in a position to immediately use at any time in the future. It’s far too common for students to write notes in a class, then never refer to them again. Yes, they might run across some situation where the notes would be helpful. But it’s often hard to motivate going back and reading the notes—not least because that’s only the beginning; there’s still the matter of implementing whatever’s in the notes.

But the point is that with a computational essay, once you’ve found what you want, the code to implement it is right there—immediately ready to be applied to whatever has come up.

Any Subject You Want

What can computational essays be about? Almost anything! I’ve often said that for any field of study X (from archaeology to zoology), there either is now, or soon will be, a “computational X”. And any “computational X” can immediately be explored and explained using computational essays.

But even when there isn’t a clear “computational X” yet, computational essays can still be a powerful way to organize and present material. In some sense, the very fact that a sequence of computations is typically needed to “tell the story” in an essay helps define a clear backbone for the whole essay. In effect, the structured nature of the computational presentation helps suggest structure for the narrative, making it easier for students (and others) to write essays that are easy to read and understand.

But what about actual subject matter? Imagine you’re studying history, say the history of the English Civil War. Conveniently, the Wolfram Language has a lot of knowledge about history (as about so many other things) built in. So you can present the English Civil War through a kind of dialog with it. For example, you can ask it for the geography of battles:

… Entity["MilitaryConflict", "EnglishCivilWar"] …

You could ask for a timeline of the beginning of the war (you don’t need to say “first 15 battles”, because if one cares, one can just read that from the Wolfram Language code):

… Entity["MilitaryConflict", "EnglishCivilWar"]["Battles"], 15]]

You could start looking at how armies moved, or who won and who lost at different points. At first, you can write a computational essay in which the computations are basically just generating custom infographics to illustrate your narrative. But then you can go further—and start really doing “computational history”. You can start to compute various statistical measures of the progress of the war. You can find ways to quantitatively compare it to other wars, and so on.

Can you make a “computational essay” about art? Absolutely. Maybe about art history. Pick 10 random paintings by van Gogh:

van Gogh paintings output

… Entity["Person", "VincentVanGogh::9vq62"]["NotableArtworks"], 10], "Image"]

Then look at what colors they use (a surprisingly narrow selection):



Or maybe one could write a computational essay about actually creating art, or music.

What about science? You could rediscover Kepler’s laws by looking at properties of planets:

EntityClass["Planet", All][{"DistanceFromSun", "OrbitPeriod"}]



Maybe you could go on and check it for exoplanets. Or you could start solving the equations of motion for planets.

You could look at biology. Here’s the beginning of the reference sequence for the human mitochondrion:

GenomeData[{"Mitochondrion", {1, 150}}]

You can start off breaking it into possible codons:

StringPartition[%, 3]

There’s an immense amount of data about all kinds of things built into the Wolfram Language. But there’s also the Wolfram Data Repository, which contains all sorts of specific datasets. Like here’s a map of state fairgrounds in the US:

GeoListPlot[ResourceData["U.S. State Fairgrounds"][All, "GeoPosition"]]

And here’s a word cloud of the constitutions of countries that have been enacted since 2010:

… Normal[ResourceData["World Constitutions"][Select[#YearEnacted > DateObject[{2010}] &], "Text"]]]]

Quite often one’s interested in dealing not with public data, but with some kind of local data. One convenient source of this is the Wolfram Data Drop. In an educational setting, particular databins (or cloud objects in general) can be set so that they can be read (and/or added to) by some particular group. Here’s a databin that I accumulate for myself, showing my heart rate through the day. Here it is for today:



Of course, it’s easy to make a histogram too:



What about math? A key issue in math is to understand why things are true. The traditional approach to this is to give proofs. But computational essays provide an alternative. The nature of the steps in them is different—but the objective is the same: to show what’s true and why.

As a very simple example, let’s look at primes. Here are the first 50:

Table[Prime[n], {n, 50}]

Let’s find the remainder mod 6 for all these primes:

Mod[Table[Prime[n], {n, 50}], 6]

But why do only 1 and 5 occur (well, after the trivial cases of the primes 2 and 3)? We can see this by computation. Any number can be written as 6n+k for some n and k:

Table[6 n + k, {k, 0, 5}]

But if we factor numbers written in this form, we’ll see that 6n+1 and 6n+5 are the only ones that don’t have to be multiples:
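A minimal sketch of that factoring step, assuming the built-in Factor is applied to the same table of forms as above (n is left as an unassigned symbol):

```wolfram
(* Factor each of the six forms 6 n + k, for k = 0, ..., 5 *)
Factor[Table[6 n + k, {k, 0, 5}]]
```

The forms with k = 0, 2, 3 and 4 come out with an explicit numerical factor (6, 2, 3 and 2 respectively), so for n ≥ 1 they are always multiples of something smaller. Only 6n+1 and 6n+5 are returned unfactored.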



What about computer science? One could for example write a computational essay about implementing Euclid’s algorithm, studying its running time, and so on.

Define a function to give all steps in Euclid’s algorithm:

gcdlist[a_, b_] :=
 NestWhileList[{Last[#], Apply[Mod, #]} &, {a, b}, Last[#] != 0 &, 1]
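To see what this function produces, here is a short usage sketch. The definition is repeated so the block runs on its own; the pair 1071, 462 is just an illustrative choice, not something from the original essay.

```wolfram
(* Same definition as in the text: each step maps {a, b} to {b, Mod[a, b]} *)
gcdlist[a_, b_] :=
 NestWhileList[{Last[#], Apply[Mod, #]} &, {a, b}, Last[#] != 0 &, 1]

(* All steps of the algorithm for one pair of numbers *)
steps = gcdlist[1071, 462]

(* The first element of the final pair is the greatest common divisor *)
First[Last[steps]] == GCD[1071, 462]
```

The length of the list is what the “running length” below counts, so it includes the starting pair as well as the final pair with remainder 0.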

Find the distribution of running lengths for the algorithm for numbers up to 200:

Histogram[Flatten[Table[Length[gcdlist[i, j]], {i, 200}, {j, 200}]]]

Or in modern times, one could explore machine learning, starting, say, by making a feature space plot of part of the MNIST handwritten digits dataset:

FeatureSpacePlot[RandomSample[Keys[ResourceData["MNIST"]], 50]]

If you wanted to get deeper into software engineering, you could write a computational essay about the HTTP protocol. This gets an HTTP response from a site:
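One way this could look, as a sketch only: it assumes the built-in URLRead is used, and it borrows the URL from the webpage example that follows. It needs network access, and what comes back depends on the site at the time.

```wolfram
(* Request the page and get back an HTTPResponse object (needs network access) *)
response = URLRead["http://www.wolframalpha.com"]

(* Pull out individual parts of the response *)
response["StatusCode"]
response["Headers"]
```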



And this shows the tree structure of the elements on the webpage at that URL:

TreeForm[Import["http://www.wolframalpha.com", {"HTML", "XMLObject"}],
  VertexLabeling -> False, AspectRatio -> 1/2]

Or—in a completely different direction—you could talk about anatomy:

AnatomyPlot3D[Entity["AnatomicalStructure", "LeftFoot"]]

What Makes a Good Computational Essay?

As far as I’m concerned, for a computational essay to be good, it has to be as easy to understand as possible. The format helps quite a lot, of course. Because a computational essay is full of outputs (often graphical) that are easy to skim, and that immediately give some impression of what the essay is trying to say. It also helps that computational essays are structured documents, that deliver information in well-encapsulated pieces.

But ultimately it’s up to the author of a computational essay to make it clear. But another thing that helps is that the nature of a computational essay is that it must have a “computational narrative”—a sequence of pieces of code that the computer can execute to do what’s being discussed in the essay. And while one might be able to write an ordinary essay that doesn’t make much sense but still sounds good, one can’t ultimately do something like that in a computational essay. Because in the end the code is the code, and actually has to run and do things.

So what can go wrong? Well, like English prose, Wolfram Language code can be unnecessarily complicated, and hard to understand. In a good computational essay, both the ordinary text, and the code, should be as simple and clean as possible. I try to enforce this for myself by saying that each piece of input should be at most one or perhaps two lines long—and that the caption for the input should always be just one line long. If I’m trying to do something where the core of it (perhaps excluding things like display options) takes more than a line of code, then I break it up, explaining each line separately.

Another important principle as far as I’m concerned is: be explicit. Don’t have some variable that, say, implicitly stores a list of words. Actually show at least part of the list, so people can explicitly see what it’s like. And when the output is complicated, find some tabulation or visualization that makes the features you’re interested in obvious. Don’t let the “key result” be hidden in something that’s tucked away in the corner; make sure the way you set things up makes it front and center.

Use the structured nature of notebooks. Break up computational essays with section headings, again helping to make them easy to skim. I follow the style of having a “caption line” before each input. Don’t worry if this somewhat repeats what a paragraph of text has said; consider the caption something that someone who’s just “looking at the pictures” might read to understand what a picture is of, before they actually dive into the full textual narrative.

The technology of Wolfram Notebooks makes it straightforward to put in interactive elements, like Manipulate, into computational essays. And sometimes this is very helpful, and perhaps even essential. But interactive elements shouldn’t be overused. Because whenever there’s an element that requires interaction, this reduces the ability to skim the essay.

Sometimes there’s a fair amount of data—or code—that’s needed to set up a particular computational essay. The cloud is very useful for handling this. Just deploy the data (or code) to the Wolfram Cloud, and set appropriate permissions so it can automatically be read whenever the code in your essay is executed.

Notebooks also allow “reverse closing” of cells—allowing an output cell to be immediately visible, even though the input cell that generated it is initially closed. This kind of hiding of code should generally be avoided in the body of a computational essay, but it’s sometimes useful at the beginning or end of an essay, either to give an indication of what’s coming, or to include something more advanced where you don’t want to go through in detail how it’s made.

OK, so if a computational essay is done, say, as homework, how can it be assessed? A first, straightforward question is: does the code run? And this can be determined pretty much automatically. Then after that, the assessment process is very much like it would be for an ordinary essay. Of course, it’s nice and easy to add cells into a notebook to give comments on what’s there. And those cells can contain runnable code—that for example can take results in the essay and process or check them.

Are there principles of good computational essays? Here are a few candidates:

0. Understand what you’re talking about (!)

1. Find the most straightforward and direct way to represent your subject matter

2. Keep the core of each piece of Wolfram Language input to a line or two

3. Use explicit visualization or other information presentation as much as possible

4. Try to make each input+caption independently understandable

5. Break different topics or directions into different subsections

Learning the Language

At the core of computational essays is the idea of expressing computational thoughts using the Wolfram Language. But to do that, one has to know the language. Now, unlike human languages, the Wolfram Language is explicitly designed (and, yes, that’s what I’ve been doing for the past 30+ years) to follow definite principles and to be as easy to learn as possible. But there’s still learning to be done.

One feature of the Wolfram Language is that—like with human languages—it’s typically easier to read than to write. And that means that a good way for people to learn what they need to be able to write computational essays is for them first to read a bunch of essays. Perhaps then they can start to modify those essays. Or they can start creating “notes essays”, based on code generated in livecoding or other classroom sessions.

As people get more fluent in writing the Wolfram Language, something interesting happens: they start actually expressing themselves in the language, and using Wolfram Language input to carry significant parts of the narrative in a computational essay.

When I was writing An Elementary Introduction to the Wolfram Language (which itself is written in large part as a sequence of computational essays) I had an interesting experience. Early in the book, it was decently easy to explain computational exercises in English (“Make a table of the first 10 squares”). But a little later in the book, it became a frustrating process.

It was easy to express what I wanted in the Wolfram Language. But to express it in English was long and awkward (and had a tendency of sounding like legalese). And that’s the whole point of using the Wolfram Language, and the reason I’ve spent 30+ years building it: because it provides a better, crisper way to express computational thoughts.

It’s sometimes said of human languages that the language you use determines how you think. It’s not clear how true this is of human languages. But it’s absolutely true of computer languages. And one of the most powerful things about the Wolfram Language is that it helps one formulate clear computational thinking.

Traditional computer languages are about writing code that describes the details of what a computer should do. The point of the Wolfram Language is to provide something much higher level—that can immediately talk about things in the world, and that can allow people as directly as possible to use it as a medium of computational thinking. And in a sense that’s what makes a good computational essay possible.

The Long Path to Computational Essays

Now that we have full-fledged computational essays, I realize I’ve been on a path towards them for nearly 40 years. At first I was taking interactive computer output and Scotch-taping descriptions into it:

By 1981, when I built SMP, I was routinely writing documents that interspersed code and explanations:

But it was only in 1986, when I started documenting what became Mathematica and the Wolfram Language, that I started seriously developing a style close to what I now favor for computational essays:

And with the release of Mathematica 1.0 in 1988 came another critical element: the invention of Wolfram Notebooks. Notebooks arrived in a form at least superficially very similar to the way they are today (and already in many ways more sophisticated than the imitations that started appearing 25+ years later!): collections of cells arranged into groups, and capable of containing text, executable code, graphics, etc.

Early Mac notebooks

At first notebooks were only possible on Mac and NeXT computers. A few years later they were extended to Microsoft Windows and X Windows (and later, Linux). But immediately people started using notebooks both to provide reports about what they’d done, and to create rich expository and educational material. Within a couple of years, there started to be courses based on notebooks, and books printed from notebooks, with interactive versions available on CD-ROM at the back:

So in a sense the raw material for computational essays already existed by the beginning of the 1990s. But to really make computational essays come into their own required the development of the cloud—as well as the whole broad range of computational knowledge that’s now part of the Wolfram Language.

By 1990 it was perfectly possible to create a notebook with a narrative, and people did it, particularly about topics like mathematics. But if there was real-world data involved, things got messy. One had to make sure that whatever was needed was appropriately available from a distribution CD-ROM or whatever. We created a Player for notebooks very early, that was sometimes distributed with notebooks.

But in the last few years, particularly with the development of the Wolfram Cloud, things have gotten much more streamlined. Because now you can seamlessly store things in the cloud and use them anywhere. And you can work directly with notebooks in the cloud, just using a web browser. In addition, thanks to lots of user-assistance innovations (including natural language input), it’s become even easier to write in the Wolfram Language—and there’s ever more that can be achieved by doing so.

And the important thing that I think has now definitively happened is that it’s become lightweight enough to produce a good computational essay that it makes sense to do it as something routine—either professionally in writing reports, or as a student doing homework.

Ancient Educational History

The idea of students producing computational essays is something new for modern times, made possible by a whole stack of current technology. But there’s a curious resonance with something from the distant past. You see, if you’d learned a subject like math in the US a couple of hundred years ago, a big thing you’d have done is to create a so-called ciphering book—in which over the course of several years you carefully wrote out the solutions to a range of problems, mixing explanations with calculations. And the idea then was that you kept your ciphering book for the rest of your life, referring to it whenever you needed to solve problems like the ones it included.

Well, now, with computational essays you can do very much the same thing. The problems you can address are vastly more sophisticated and wide-ranging than you could reach with hand calculation. But like with ciphering books, you can write computational essays so they’ll be useful to you in the future—though now you won’t have to imitate calculations by hand; instead you’ll just edit your computational essay notebook and immediately rerun the Wolfram Language inputs in it.

I actually only learned about ciphering books quite recently. For about 20 years I’d had essentially as an artwork a curious handwritten notebook (created in 1818, it says, by a certain George Lehman, apparently of Orwigsburg, Pennsylvania), with pages like this:

I now know this is a ciphering book—that on this page describes how to find the “height of a perpendicular object… by having the length of the shadow given”. And of course I can’t resist a modern computational essay analog, which, needless to say, can be a bit more elaborate.

Find the current position of the Sun as azimuth, altitude:



Find the length of a shadow for an object of unit height:



Given a 10-ft shadow, find the height of the object that made it:
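The three steps above can be sketched as follows. This is an illustration under stated assumptions, not necessarily the inputs from the original essay: SunPosition[] is the built-in function and uses your current location and time, while shadowLength and heightFromShadow are hypothetical helper names introduced here.

```wolfram
(* Current Sun position as {azimuth, altitude}, for the current location and time *)
{azimuth, altitude} = SunPosition[];

(* Hypothetical helper: shadow length of an object of unit height,
   given the Sun's altitude angle *)
shadowLength[alt_] := 1/Tan[alt]
shadowLength[altitude]

(* Hypothetical helper: height of the object that casts a given shadow *)
heightFromShadow[shadow_, alt_] := shadow Tan[alt]
heightFromShadow[Quantity[10, "Feet"], altitude]
```

With the Sun at 45 degrees, a unit-height object casts a unit-length shadow, which makes a convenient check. At night the altitude is negative, so the results only make sense while the Sun is above the horizon.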



The Path Ahead

I like writing textual essays (such as blog posts!). But I like writing computational essays more. Because at least for many of the things I want to communicate, I find them a purer and more efficient way to do it. I could spend lots of words trying to express an idea—or I can just give a little piece of Wolfram Language input that expresses the idea very directly and shows how it works by generating (often very visual) output with it.

When I wrote my big book A New Kind of Science (from 1991 to 2002), neither our technology nor the world was quite ready for computational essays in the form in which they’re now possible. My research for the book filled thousands of Wolfram Notebooks. But when it actually came to putting together the book, I just showed the results from those notebooks—including a little of the code from them in notes at the back of the book.

But now the story of the book can be told in computational essays—that I’ve been starting to produce. (Just for fun, I’ve been livestreaming some of the work I’m doing to create these.) And what’s very satisfying is just how clearly and crisply the ideas in the book can be communicated in computational essays.

There is so much potential in computational essays. And indeed we’re now starting the project of collecting “topic explorations” that use computational essays to explore a vast range of topics in unprecedentedly clear and direct ways. It’ll be something like our Wolfram Demonstrations Project (that now has 11,000+ Wolfram Language–powered Demonstrations). Here’s a typical example I wrote:

The Central Limit Theorem

Computational essays open up all sorts of new types of communication. Research papers that directly present computational experiments and explorations. Reports that describe things that have been found, but allow other cases to be immediately explored. And, of course, computational essays define a way for students (and others) to very directly and usefully showcase what they’ve learned.

There’s something satisfying about both writing—and reading—computational essays. It’s as if in communicating ideas we’re finally able to go beyond pure human effort—and actually leverage the power of computation. And for me, having built the Wolfram Language to be a computational communication language, it’s wonderful to see how it can be used to communicate so effectively in computational essays.

It’s so nice when I get something sent to me as a well-formed computational essay. Because I immediately know that I’m going to get a straight story that I can actually understand. There aren’t going to be all sorts of missing sources and hidden assumptions; there’s just going to be Wolfram Language input that stands alone, and that I can take out and study or run for myself.

The modern world of the web has brought us a few new formats for communication—like blogs, and social media, and things like Wikipedia. But all of these still follow the basic concept of text + pictures that’s existed since the beginning of the age of literacy. With computational essays we finally have something new—and it’s going to be exciting to see all the things it makes possible.


6. Limits without Limits in Version 11.2 (Thu., Nov. 9)

Limits lead image

Here are 10 terms in a sequence:

Table[(2/(2 n + 1)) ((2 n)!!/(2 n - 1)!!)^2, {n, 10}]

And here’s what their numerical values are:
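A minimal sketch of that step, assuming the numerical values are obtained simply by wrapping the same table in N:

```wolfram
(* Numerical values of the first 10 terms of the sequence *)
N[Table[(2/(2 n + 1)) ((2 n)!!/(2 n - 1)!!)^2, {n, 10}]]
```

The first two exact terms are 8/3 and 128/45, so the list starts at about 2.67 and 2.84.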


But what is the limit of the sequence? What would one get if one continued the sequence forever?

In Mathematica and the Wolfram Language, there’s a function to compute that:

DiscreteLimit[(2/(2 n + 1)) ((2 n)!!/(2 n - 1)!!)^2, n -> \[Infinity]]

Limits are a central concept in many areas, including number theory, geometry and computational complexity. They’re also at the heart of calculus, not least since they’re used to define the very notions of derivatives and integrals.

Mathematica and the Wolfram Language have always had capabilities for computing limits; in Version 11.2, they’ve been dramatically expanded. We’ve leveraged many areas of the Wolfram Language to achieve this, and we’ve invented some completely new algorithms too. And to make sure we’ve covered what people want, we’ve sampled over a million limits from Wolfram|Alpha.

Let’s talk about a limit that Hardy and Ramanujan worked out in 1918. But let’s build up to that. First, consider the sequence a(n) that is defined as follows:

a[n_] := (-1)^n/n

Here is a table of the first ten values for the sequence.

Table[a[n], {n, 1, 10}]

The following plot indicates that the sequence converges to 0 as n approaches Infinity.

DiscretePlot[a[n], {n, 1, 40}]

The DiscreteLimit function, which was introduced in Version 11.2, confirms that the limit of this sequence is indeed 0.

DiscreteLimit[a[n], n -> \[Infinity]]

Many sequences that arise in practice (for example, in signal communication) are periodic in the sense that their values repeat themselves at regular intervals. The length of any such interval is called the period of the sequence. As an example, consider the following sequence that is defined using Mod.

a[n_] := Mod[n, 6]

A plot of the sequence shows that the sequence is periodic with period 6.

DiscretePlot[a[n], {n, 0, 20}]

In contrast to our first example, this sequence does not converge, since it oscillates between 0 and 5. Hence, DiscreteLimit returns Indeterminate in this case.

DiscreteLimit[a[n], n -> \[Infinity]]

The new Version 11.2 functions DiscreteMinLimit and DiscreteMaxLimit can be used to compute the lower and upper limits of oscillation, respectively, in such cases. Thus, we have:

DiscreteMinLimit[a[n], n -> \[Infinity]]

DiscreteMaxLimit[a[n], n -> \[Infinity]]

The quantities computed by DiscreteMinLimit and DiscreteMaxLimit are usually called “lim inf” and “lim sup,” respectively, in the mathematical literature. The traditional underbar and overbar notations for these limits are available, as shown here.

 \!\(\*UnderscriptBox[\(\[MinLimit]\), \(n\* UnderscriptBox["\[Rule]",  TemplateBox[{}, "Integers"]]\[Infinity]\)]\) a[n]

 \!\(\*UnderscriptBox[\(\[MaxLimit]\), \(n\* UnderscriptBox["\[Rule]",  TemplateBox[{}, "Integers"]]\[Infinity]\)]\) a[n]
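For readers who want to see what lim inf and lim sup mean operationally, the following Python sketch (an illustration only, with the hypothetical helper tail_bounds) takes the minimum and maximum of the Mod sequence over finite windows that start later and later. A finite window is only a stand-in for the true tail of the sequence, but for a periodic sequence any window covering a full period already gives the exact bounds.

```python
def a(n: int) -> int:
    """The periodic sequence Mod[n, 6] from the text."""
    return n % 6


def tail_bounds(seq, start: int, stop: int):
    """Return (min, max) of seq(n) for start <= n < stop."""
    values = [seq(n) for n in range(start, stop)]
    return min(values), max(values)


if __name__ == "__main__":
    # However far out the window starts, the bounds stay at 0 and 5.
    for start in (0, 10, 1000):
        print(start, tail_bounds(a, start, start + 60))
```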

Our next example is an oscillatory sequence that is built from the trigonometric functions Sin and Cos, and is defined as follows.

a[n_] := Sin[2 n]^2/(2 + Cos[n])

Although Sin and Cos are periodic when viewed as functions over the real numbers, this integer sequence behaves in a bizarre manner and is very far from being a periodic sequence, as confirmed by the following plot.

DiscretePlot[a[n], {n, 1, 100}]

Hence, the limit of this sequence does not exist.

DiscreteLimit[a[n], n -> \[Infinity]]

However, it turns out that for such “densely aperiodic sequences,” the extreme values can be computed by regarding them as real functions. DiscreteMinLimit uses this method to return the answer 0 for the example, as expected.

DiscreteMinLimit[a[n], n -> \[Infinity]]

Using the same method, DiscreteMaxLimit returns a rather messy-looking result in terms of Root objects for this example.

DiscreteMaxLimit[a[n], n -> \[Infinity]]

The numerical value of this result is close to 0.8, as one might have guessed from the graph.

N[%]
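A brute-force scan gives an independent sanity check on both extreme values. The Python sketch below is an assumption-laden illustration, not the method DiscreteMinLimit or DiscreteMaxLimit uses: it evaluates the sequence at many integers and compares the largest term with a grid estimate of the maximum of the same expression over real x. The function name real_sup and the sample sizes are arbitrary choices.

```python
import math


def a(n: int) -> float:
    """a(n) = sin(2n)^2 / (2 + cos(n)), the sequence from the text."""
    return math.sin(2 * n) ** 2 / (2 + math.cos(n))


def real_sup(samples: int = 200_000) -> float:
    """Grid estimate of the maximum of the same expression over real x in [0, 2*pi)."""
    step = 2 * math.pi / samples
    return max(
        math.sin(2 * k * step) ** 2 / (2 + math.cos(k * step))
        for k in range(samples)
    )


if __name__ == "__main__":
    values = [a(n) for n in range(1, 100_001)]
    print("smallest term:", min(values))
    print("largest term: ", max(values))
    print("grid maximum over real x:", real_sup())
```

The integers never hit the real-variable extremes exactly, but because n mod 2π spreads densely over the circle, the scanned terms come very close to them.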


Discrete limits also occur in a natural way when we try to compute the value of infinitely nested radicals. For example, consider the problem of evaluating the following nested radical.

√(2 + √(2 + √(2 + ⋯)))

The successive terms in the expansion of the radical can be generated by using RSolveValue, since the sequence satisfies a nonlinear recurrence. For example, the third term in the expansion is obtained as follows.

RSolveValue[{r[n + 1] == Sqrt[2 + r[n]], r[1] == Sqrt[2]}, r[3], n]

The value of the infinitely nested radical appears to be 2, as seen from the following plot that is generated using RecurrenceTable.

ListPlot[RecurrenceTable[{r[n + 1] == Sqrt[2 + r[n]], r[1] == Sqrt[2]}, r[n], {n, 2, 35}]]

Using Version 11.2, we can confirm that the limiting value is indeed 2 by requesting the value r(∞) in RSolveValue, as shown here.

RSolveValue[{r[n + 1] == Sqrt[2 + r[n]], r[1] == Sqrt[2]}, r[\[Infinity]], n]
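The value 2 is also what a fixed-point argument suggests: if the terms settle down to some L, then L must satisfy L = √(2 + L), whose only positive solution is 2. The Python sketch below (an illustration with a hypothetical function name, starting from r[1] = √2 as in the recurrence above) simply iterates the recurrence and prints how quickly the gap to 2 closes.

```python
import math


def nested_radical(depth: int) -> float:
    """Return r[depth] for r[1] = sqrt(2), r[n+1] = sqrt(2 + r[n])."""
    r = math.sqrt(2.0)
    for _ in range(depth - 1):
        r = math.sqrt(2.0 + r)
    return r


if __name__ == "__main__":
    # Print the depth, the value, and its distance from 2.
    for depth in (1, 2, 3, 5, 10, 35):
        value = nested_radical(depth)
        print(depth, value, 2.0 - value)
```

Each extra square root shrinks the gap by roughly a factor of four, so double precision cannot tell the value from 2 after a few dozen steps.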

The study of limits belongs to the branch of mathematics called asymptotic analysis. Asymptotic analysis provides methods for obtaining approximate solutions of problems near a specific value such as 0 or Infinity. It turns out that, in practice, the efficiency of asymptotic approximations often increases precisely in the regime where the corresponding exact computation becomes difficult! A striking example of this phenomenon is seen in the study of integer partitions, whose number grows extremely fast as the integer being partitioned increases. For example, the number 6 can be partitioned in 11 distinct ways, which IntegerPartitions lists as shown here.

IntegerPartitions[6] // TableForm


The number of distinct partitions can be found directly using PartitionsP as follows.

PartitionsP[6]

As noted earlier, the number of partitions grows rapidly with the size of the integer. For example, there are nearly 4 trillion partitions of the number 200.

PartitionsP[200]


In 1918, Hardy and Ramanujan provided an asymptotic approximation for this number, which is given by the following formula.

asymp[n_] := E^(\[Pi] Sqrt[(2 n)/3])/(4 n Sqrt[3])

The answer given by this estimate for the number 200 is remarkably close to 4 trillion.

asymp[200] // N
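To see how close the estimate is without relying on PartitionsP, one can count partitions directly. The Python sketch below is an independent illustration (the function names are hypothetical, and the dynamic program is a textbook coin-change count, not the algorithm the Wolfram Language uses). It compares the exact count with the leading-order formula given above.

```python
import math


def partitions_p(n: int) -> int:
    """Count integer partitions of n with a coin-change style dynamic program."""
    counts = [1] + [0] * n  # counts[t]: partitions of t using the parts allowed so far
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            counts[total] += counts[total - part]
    return counts[n]


def hardy_ramanujan(n: int) -> float:
    """Leading-order estimate exp(pi*sqrt(2n/3)) / (4*n*sqrt(3))."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))


if __name__ == "__main__":
    # Print n, the exact count, and the ratio estimate/exact.
    for n in (6, 50, 200):
        exact = partitions_p(n)
        print(n, exact, hardy_ramanujan(n) / exact)
```

The ratio stays above 1 for these inputs and drifts toward 1 as n grows, which is the sense in which the estimate is asymptotic rather than exact.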

With a much larger integer, we get an even better approximation for the number of partitions almost instantaneously, as seen in the following example.

PartitionsP[2000000000] // N // Timing

N[asymp[2000000000] , 20] // Timing

Finally, we can confirm that the ratio of the exact number of partitions to the asymptotic estimate approaches 1 as n tends to Infinity using DiscreteLimit, which is aware of the Hardy–Ramanujan formula discussed above.

DiscreteLimit[PartitionsP[n]/asymp[n], n -> \[Infinity]]

Formally, we say that exact and approximate formulas for the number of partitions are asymptotically equivalent as n approaches Infinity.

Asymptotic notions also play an important role in the study of function limits. For instance, the small-angle approximation in trigonometry asserts that “sin(x) is nearly equal to x for small values of x.” This may be rephrased as “sin(x) is asymptotically equivalent to x as x approaches 0.” A formal statement of this result can be given using Limit, which computes function limits, as follows.

Limit[Sin[x]/x, x -> 0]

This plot provides visual confirmation that the limit is indeed 1.

Plot[Sin[x]/x, {x, -20, 20}, PlotRange -> All]

The above limit can also be calculated using L’Hôpital’s rule by computing the derivatives, cos(x) and 1, of the numerator and denominator respectively, as shown here.

Limit[Cos[x]/1, x -> 0]

L’Hôpital’s rule gives a powerful method for evaluating many limits that occur in practice. However, it may require a large number of steps before arriving at the answer. For example, consider the following limit.

Limit[x^6/E^x, x -> \[Infinity]]

That limit requires six repeated applications of L’Hôpital’s rule to arrive at the answer 0, since all the intermediate computations give indeterminate results.

Table[Limit[D[x^6, {x, n}], x -> \[Infinity]]/Limit[D[E^x, {x, n}], x -> \[Infinity]], {n, 0, 10}]
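Written out by hand, the six applications form the following chain. Each step differentiates the numerator and the denominator once, and only the last quotient stops being of the form ∞/∞.

```latex
\lim_{x\to\infty}\frac{x^{6}}{e^{x}}
  = \lim_{x\to\infty}\frac{6x^{5}}{e^{x}}
  = \lim_{x\to\infty}\frac{30x^{4}}{e^{x}}
  = \lim_{x\to\infty}\frac{120x^{3}}{e^{x}}
  = \lim_{x\to\infty}\frac{360x^{2}}{e^{x}}
  = \lim_{x\to\infty}\frac{720x}{e^{x}}
  = \lim_{x\to\infty}\frac{720}{e^{x}}
  = 0
```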

Thus, we see that L’Hôpital’s rule has limited utility as a practical algorithm for finding function limits, since it is impossible to decide when the algorithm should stop! Hence, the built-in Limit function uses a combination of series expansions and modern algorithms that works well on inputs involving exponentials and logarithms, the so-called “exp-log” class. In fact, Limit has received a substantial update in Version 11.2 and now handles a wide variety of difficult examples, such as the following, in a rather comprehensive manner (the last two examples work only in the latest release).

Limit[Gamma[x + 1/2]/(Gamma[x] Sqrt[x]), x -> \[Infinity]]

Limit[(Log[x] (-Log[Log[x]] + Log[Log[x] + Log[Log[x]]]))/Log[Log[x] + Log[Log[Log[x]]]], x -> \[Infinity]]

Limit[E^E^E^PolyGamma[PolyGamma[PolyGamma[x]]]/x, x -> \[Infinity]]

Limit[E^(E^x + x^2) (-Erf[E^(-E^x) - x] - Erf[x]), x -> \[Infinity]]

As in the cases of sequences, the limits of periodic and oscillatory functions will often not exist. One can then use MaxLimit and MinLimit, which, like their discrete counterparts, give tight bounds for the oscillation of the function near a given value, as in this classic example.

f[x_] := Sin[1/x]

Plot[f[x], {x, -1, 1}]

The graph indicates the function oscillates rapidly between –1 and 1 near 0. These bounds are confirmed by MaxLimit and MinLimit, while Limit itself returns Indeterminate.

{Limit[f[x], x -> 0], MinLimit[f[x], x -> 0], MaxLimit[f[x], x -> 0]}
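The bounds –1 and 1 are attained arbitrarily close to the origin, which is easy to exhibit explicitly. In the Python sketch below (an illustration with hypothetical helper names), peak(k) and trough(k) are points where 1/x equals π/2 + 2πk and 3π/2 + 2πk, so sin(1/x) is exactly 1 and –1 there in exact arithmetic, while both families of points march toward 0 as k grows.

```python
import math


def f(x: float) -> float:
    """The function Sin[1/x] from the text."""
    return math.sin(1.0 / x)


def peak(k: int) -> float:
    """Point x with 1/x = pi/2 + 2*pi*k, where sin(1/x) equals 1."""
    return 1.0 / (math.pi / 2 + 2 * math.pi * k)


def trough(k: int) -> float:
    """Point x with 1/x = 3*pi/2 + 2*pi*k, where sin(1/x) equals -1."""
    return 1.0 / (3 * math.pi / 2 + 2 * math.pi * k)


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        print(peak(k), f(peak(k)), trough(k), f(trough(k)))
```

Since both values keep recurring no matter how small x gets, no single limiting value can exist, while the upper and lower limits are pinned at 1 and –1.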

In the previous example, the limit fails to exist because the function oscillates wildly around the origin. Discontinuous functions provide other types of examples where the limit at a point may fail to exist. We will now consider an example of such a function with a jump discontinuity at the origin and other values. The function is defined in terms of SquareWave and FresnelS, as follows.

g[x_] := (SquareWave[x] FresnelS[x])/x^3

This plot shows the jump discontinuities, which are caused by the presence of SquareWave in the definition of the function.

Plot[{g[x], -Pi/6, Pi/6}, {x, -2, 2},   ExclusionsStyle -> Directive[Red, Dashed]]

We see that the limiting values of the function at 0, for instance, depend on the direction from which we approach the origin. The limiting value from the right (“from above”) can be calculated using the Direction option.

Limit[g[x], x -> 0, Direction -> "FromAbove"]

Similarly, the limit from the left can be calculated as follows.

Limit[g[x], x -> 0, Direction -> "FromBelow"]

The limit proper, when it exists, is the “two-sided” limit of the function. In this case the one-sided limits disagree, so the two-sided limit does not exist.

Limit[g[x], x -> 0, Direction -> "TwoSided"]

By default, Limit computes two-sided limits in Version 11.2. This is a change from earlier versions, where it computed the limit from above by default. Hence, we get an Indeterminate result from Limit, with no setting for the Direction option.

Limit[g[x], x -> 0]

Directional limits acquire even more significance in the multivariate case, since there are many possible directions for approaching a given point in higher dimensions. For example, consider the bivariate function f(x,y) that is defined as follows.

f[x_, y_] := (x y)/(x^2 + y^2)

The limiting value of this function at the origin is 0 if we approach it along the x axis, which is given by y=0, since the function has the constant value 0 along this line.

f[x, 0]

Similarly, the limiting value of the function at the origin is 0 if we approach it along the y axis, which is given by x=0.

f[0, y]

However, the limit is 1/2 if we approach the origin along the line y=x, as seen here.

f[x, y] /. {y -> x}

More generally, the limiting value changes as we approach the origin along different lines y=m x.

f[x, y] /. {y -> m x} // Simplify

The directional dependence of the limiting value implies that the true multivariate limit does not exist. In Version 11.2, Limit handles multivariate examples with ease, and quickly returns the expected answer Indeterminate for the limit of this function at the origin.

Limit[f[x, y], {x, y} -> {0, 0}]
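The path dependence can also be checked numerically. The Python sketch below (an illustration only; along_line is a hypothetical helper) evaluates the function at points (t, m t) sliding toward the origin and compares the values with m/(1 + m²), the constant value that the substitution y = m x predicts along each line.

```python
def f(x: float, y: float) -> float:
    """The bivariate function x*y / (x^2 + y^2) from the text."""
    return x * y / (x * x + y * y)


def along_line(m: float, t: float) -> float:
    """Value of f at the point (t, m*t) on the line y = m*x."""
    return f(t, m * t)


if __name__ == "__main__":
    for m in (0.0, 0.5, 1.0, -1.0, 3.0):
        samples = [along_line(m, 10.0 ** -k) for k in (1, 4, 8)]
        print(m, samples, m / (1 + m * m))
```

Every line yields a different constant, so no single value can serve as the limit at the origin.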

A plot of the surface z=f(x,y) confirms the behavior of the function near the origin.

Plot3D[f[x, y], {x, -4, 4}, {y, -4, 4}]

This example indicates that multivariate limits need not exist, even for simple-looking functions. In other cases, such as the following, the limit exists but the computation is subtle.

f[x_, y_] := (x^2 + y^2)/(3^(Abs[x] + Abs[y]) - 1)

This plot indicates that the limit of this function at {0,0} exists and is 0, since the function values appear to approach 0 from all directions.

Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}, PlotRange -> All]

The answer can be confirmed by applying Limit to the function directly.

Limit[f[x, y], {x, y} -> {0, 0}]
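One way to make the result plausible by hand is a squeeze argument: since 3^s − 1 ≥ s ln 3 for s ≥ 0, and |x| + |y| is at least the distance r from the origin, the function is bounded above by r/ln 3, which tends to 0. The Python sketch below (an illustration; max_on_circle is a hypothetical helper that samples a finite number of directions) checks that bound on circles of shrinking radius.

```python
import math


def f(x: float, y: float) -> float:
    """The function (x^2 + y^2) / (3^(|x| + |y|) - 1) from the text."""
    return (x * x + y * y) / (3.0 ** (abs(x) + abs(y)) - 1.0)


def max_on_circle(radius: float, directions: int = 360) -> float:
    """Largest value of f over evenly spaced directions at the given radius."""
    return max(
        f(radius * math.cos(2 * math.pi * k / directions),
          radius * math.sin(2 * math.pi * k / directions))
        for k in range(directions)
    )


if __name__ == "__main__":
    # Print the radius, the sampled maximum, and the bound r / ln 3.
    for radius in (1.0, 0.1, 0.01, 0.001):
        print(radius, max_on_circle(radius), radius / math.log(3))
```

Unlike the previous example, the bound here does not depend on the direction of approach, which is why the limit exists.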

A rich source of multivariate limit examples is provided by the steady stream of inputs that is received by Wolfram|Alpha each day. We acquired around 100,000 anonymized queries to Wolfram|Alpha from earlier years, which were then evaluated using Version 11.2. Here is a fairly complicated example from this vast collection that Limit handles with ease in the latest version.

f[x_, y_] := Cos[Abs[x] Abs[y]] - 1

Plot3D[f[x, y], {x, -3, 3}, {y, -3, 3}]

Limit[f[x, y], {x, y} -> {0, 0}]

It is a sheer joy to browse through the examples from Wolfram|Alpha, so we decided to share 1,000 nontrivial examples from the collection with you. Sample images of the examples are shown below. The five notebooks with the examples can be downloaded here.

Downloadable notebooks

Version 11.2 evaluates 90% of the entire collection in the benchmark, which is remarkable since the functionality for multivariate limits is new in this release.

Limits pie chart

Version 11.2 also evaluates a higher fraction (96%) of an even larger collection of 1,000,000 univariate limits from Wolfram|Alpha when compared with Version 11.1 (94%). The small percentage difference between the two versions can be explained by noting that most Wolfram|Alpha queries for univariate limits relate to a first or second course in college calculus and are easily computed by Limit in either version.

Limit has been one of the most dependable functions in the Wolfram Language ever since it was first introduced in Version 1 (1988). The improvements for this function, along with DiscreteLimit and other new functions in Version 11.2, have facilitated our journey through the world of limits. I hope that you have enjoyed this brief tour, and welcome any comments or suggestions about the new features.



7. What Can You Say in One Line of the Wolfram Language? The 2017 One-Liner Competition (Wed, Nov 8)

The One-Liner Competition is a tradition at our annual Wolfram Technology Conference, which took place at our headquarters in Champaign, Illinois, two weeks ago. We challenge attendees to show us the most impressive effects they can achieve with 128 characters or fewer of Wolfram Language code. We are never disappointed, and often surprised by what they show us can be done with the language we work so hard to develop—the language we think is the world’s most powerful and fun.

Melting flags

This year’s winning submissions included melting flags, computer vision and poetry. Read on to see how far you can go with just a few characters of Wolfram Language code…

Honorable Mention
Pedro Fonseca: Dynamically Restyled Wolf (128 characters)

Pedro’s One-Liner submission riffed on another conference competition: use the new ImageRestyle function to make an appealing restyling of the wolf icon.

e = WebImageSearch["wolf", "Thumbnails"]; b = a = Rasterize[Style[\[Wolf], 99]]; i = ImageRestyle; Dynamic[b = i[a, i[b, .8 -> RandomChoice[e]]]]

Output 1Wolfie restyle

To stay within the 128-character limit, Pedro used the \[Wolf] character instead of the icon. Embedding the restyling in a Dynamic expression so that it displays endless variations and using random wolf images to restyle the wolf icon were nice touches that favorably impressed the judges.

Honorable Mention
Edmund Robinson: Deep File Explorer (120 characters)

Edmund’s submission actually does something useful! (But as one snarky One-Liner judge pointed out, no submission that does something useful has ever won the competition.) His file explorer uses Dataset to make a nicely formatted browser of file properties, a way to get a quick overview of what’s in a file. A lot of beautiful, useful functionality in 120 characters of code!

Robinson one-line graphicRobinson one-line part 2

Honorable Mention
Daniel Reynolds: Super Name (132 characters)

The judges had fun with Daniel’s name generator. Unfortunately, as submitted, it was four characters over the 128-character limit. Surely an oversight, as it could easily be shortened, but nevertheless the judges were obliged to disqualify the submission. We hope you’ll participate again next year, Daniel.

"CONGRATS! Your new name is " <>   ToString[Capitalize[RandomWord[]]] <> " the ruler of " <>   ToString[Capitalize[RandomWord[]]] <> "sylvania!"

Third Place
Amy Friedman: Autumn Wolframku (83 characters)

Amy’s “Wolframku” generator is itself, at a mere 83 characters, programming poetry. Using WikipediaData to gather autumn-themed words and the brand-new TakeList function to form them poetically, it generates haiku-like verses—often indecipherable, sometimes surprisingly profound.

StringRiffle[TakeList[RandomChoice[TextWords[WikipediaData["Autumn"]], 11], {3, 5, 3}]]
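For readers who have not met TakeList yet: it cuts a list into consecutive pieces of the requested lengths, here 3, 5 and 3 words out of the 11 chosen. The Python sketch below mimics that layout step as an illustration only. The names take_list and wolframku are hypothetical, and the short word list is a made-up stand-in for the Wikipedia text.

```python
import random
from itertools import islice


def take_list(items, sizes):
    """Split items into consecutive chunks whose lengths are given by sizes."""
    it = iter(items)
    return [list(islice(it, size)) for size in sizes]


def wolframku(words, rng=random):
    """Pick 11 words with replacement and lay them out as lines of 3, 5 and 3."""
    chosen = [rng.choice(words) for _ in range(11)]
    return "\n".join(" ".join(line) for line in take_list(chosen, [3, 5, 3]))


if __name__ == "__main__":
    # Hypothetical vocabulary; the original draws its words from WikipediaData.
    sample = "leaves fall harvest equinox season colour frost wind".split()
    print(wolframku(sample))
```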

Amy, a professor of English, has learned the Wolfram Language in part with the encouragement and help of her son Jesse, the youngest-ever prizewinner in our One-Liner Competition, who took second place in 2014 at the age of 13.

Second Place
Peter Roberge: Toy Self-Driving Car (119 characters)

Peter’s submission does an amazing amount of image processing in a few characters of code, adjusting, recognizing and highlighting frames to identify and track vehicles in a video. The smarts are contained in the new ImageContents function, which Peter must have done some sleuthing to discover, since it is included in 11.2 but not yet documented.

e = ExampleData; e@e[e[][[12]]][[4]] /. x_Graphics :> HighlightImage[x, Normal@ImageContents[ImageAdjust[x, {0, 0, 4.7}]][All, 1]]

Output 5Car tracker

First Place
George Varnavides: Animating the Melting Pot (128 characters)

George’s mastery of the Wolfram Language enabled him to squeak in right at the 128-character limit with his submission that expresses the “melting pot” metaphor both graphically—as animations of melting flags—and conceptually, since the melting effect is achieved by melding multiple flag images. His use of Echo, Infix operators, memoization and FoldList shows a deep understanding of the Wolfram Language. Nicely done, George!

f := a = DeleteMissing[#@"Flag" & /@ RandomEntity["Country", 9]]
Echo@ListAnimate@FoldList[ImageRestyle, f[[1]], Thread[.01 -> Rest@a]]~Do~9

Output 6Melting flags

There’s more! We can’t mention every submission individually, but all merit a look. You might even pick up some interesting coding ideas from them. You can see all of the submissions in this downloadable notebook. Thanks to all who participated and impressed us with their ideas and coding prowess! See you again next year.


8. From Aircraft to Optics: Wolfram Innovator Awards 2017 (Thu, Nov 2)

As is tradition at the annual Wolfram Technology Conference, we recognize exceptional users and organizations for their innovative usage of our technologies across a variety of disciplines and fields.

Award winners with Stephen Wolfram

Nominated candidates undergo a vetting process, and are then evaluated by a panel of experts to determine winners. This year we’re excited to announce the recipients of the 2017 Wolfram Innovator Awards.

Youngjoo Chung

Youngjoo Chung

Dr. Chung is the creator of a very extensive symbolic computing and vector analysis package for the Wolfram Language. This package enhances our UI by allowing the user to use and symbolically manipulate expressions in traditional inline notation—the same kind of notation you see when you open up a textbook.

For Wolfram as a company, one of his particular accomplishments has been his presence in the international community of Wolfram Language users. He is the president of the Korean Mathematica Users group, and is very active in arranging Mathematica user conferences across South Korea.

Massimo Fazio

Massimo Fazio

Dr. Fazio is an ophthalmology professor who analyzes data from OCT, a powerful imaging technique that builds a 3D image from a sequence of layered 2D image slices of the eye. Usually this analysis is tedious, which makes what Dr. Fazio was able to do that much more remarkable: he has been using the 3D image processing capabilities of the Wolfram Language to automate the analysis of generated OCT images.

In the future, this automation will be extended to cover the entire diagnostic process of glaucoma just by looking at OCT images—with all computations being done using the Wolfram Cloud architecture, or even embedded within the actual devices that are making these measurements.

David Leigh-Lancaster

Mathematical Methods Computer-Based Exam Team

The Victorian Curriculum and Assessment Authority built a massive system using the Wolfram Language and Mathematica to allow students to go through their entire math education in a computer-based fashion, with actual assessments of student performance taking the form of small computational essays. This is now being done in a dozen or so schools in the state of Victoria.

Accepting the award for the team is Dr. David Leigh-Lancaster, who himself started off studying intuitionistic logic—like classical logic, but without the principle of the excluded middle or double negation rules. Eventually he moved into mathematics teaching and education policy, where he was introduced to Mathematica by secondary school students in the early 1990s; learning from his students, as good teachers often do, he quickly started using it in a very serious way.

His work resulted in the widespread use of Wolfram technologies—nearly 700,000 students in the state of Victoria now have access to our entire educational technology suite, making David instrumental in bringing a very broad license for Wolfram technology to a chunk of the country.

The efforts of David and the team are a neat example of the continuing modernization of mathematics education that’s made possible by the technology that’s been developed here for the last three decades.

David Milner

David Milner

David was introduced to Wolfram technologies last year through Wolfram SystemModeler, using it to fully render and model military vehicles. While past projects have primarily been wheeled vehicles, David recently completed a project conceptualizing a successor to the Sikorsky UH-60 Black Hawk helicopter.

His octocopter simulation is exciting to see, mostly because it’s a good example of how an extremely complex system can be modeled with our technology—all electrical and mechanical components and subsystems were built completely with SystemModeler.

Peter Nilsson

Peter Nilsson

Peter is one of the more unusual recipients of this year’s award, which in turn makes him incredibly interesting. Far from the typical English teacher, Peter organized the very first high-school digital humanities course using the Wolfram Language.

The course starts off having students analyze Hamlet using our text analysis features. This same analysis is then applied to the students’ own writing, allowing them to see their progression through the course and compare and contrast their own writing style with that of Shakespeare.

Peter is also the director of research, innovation and outreach; he has been involved in many efforts to try and capture the knowledge contained in the practice of teaching, as well as the pure content of teaching.

His background is in English and music, but looking at his code, you wouldn’t expect that. It just goes to show that starting off in traditionally “nontechnical” subjects doesn’t mean you can’t be as sophisticated a computational thinker as “officially educated” techies. Computational thinking spans all disciplines, and Peter has effectively communicated this to his students through his teaching.

Chris Reed

Chris Reed

Dr. Reed is an applied mathematician who has worked on a variety of interesting projects in numerous disciplines at the Aerospace Corporation. Using our technology since 1988, Chris has introduced countless colleagues over the years to Mathematica at Aerospace, where it is now a staple piece of software for the company.

Interestingly, many of Chris’s projects have involved algebraic computations where traditionally a numerical approach would be chosen for the job—we think this is a testament to the unique approach that a symbolic language like the Wolfram Language offers when problem solving.

Chris has attended many Wolfram Tech Conferences over the years, using the Wolfram Language to analyze a variety of interesting problems—including problems with satellite motion: he found a way to change their orbit using much less fuel than traditionally required, translating to potentially larger payloads. Additionally, he has used the Wolfram Language to create queuing simulations and management systems for other companies.

Tarkeshwar Singh

Tarkeshwar Singh

Dr. Singh works for Quiet Light Securities, a proprietary trading company out of Chicago that trades in index options and other markets. When the company first got off the ground, trading was still a very physical activity, with people jumping into trading pits and gesticulating wildly to indicate whether they were buying or selling, so it’s interesting to see the timeline of Quiet Light as it has evolved into the now-computational world of trading.

This evolution, spearheaded by Dr. Singh, includes the automation of operations at Quiet Light using the Wolfram Language and the Computable Document Format (CDF). Going forward, Dr. Singh is working to build an automated trading system—completely from scratch.

Dr. Singh himself has a wonderfully eclectic background: with a PhD in quantum electronics, an MS in financial mathematics and an MBA, it’s not surprising that he’s been able to do such extraordinary things with our technology. In addition to all of these accolades, he’s also a major in the Illinois Air National Guard who deployed to Qatar late last year. Dr. Singh is the perfect example of a Wolfram power user—having used our technology for more than 20 years.

Marco Thiel

Marco Thiel

Well known to people who frequent Wolfram Community, Dr. Thiel is the author of a huge number of fascinating contributions to the forum, covering many different topics. One of his posts that quickly rose to popularity on the site was a study of the Ebola outbreak from 2015. Another popular post of his detailed, using 20 years’ worth of oceanographic data, how the flow of water could transport radioactive particles from the Fukushima nuclear plant farther out into the ocean.

It’s also interesting to note that Dr. Thiel has been using the Wolfram Language in very far-reaching ways—ranging from legal analyses to medical-oriented applications—and he is keen on teaching his students how to do the same: his modeling course aims to teach students how to use real-world data and the Wolfram Language to connect what they know from other courses, effectively producing miniature versions of projects that Dr. Thiel would do himself.

Andrew Yule

Andrew Yule

Assured Flow Solutions (AFS) is a Dallas-based specialty engineering firm within the oil and gas industry. Specifically, they investigate the problem of flow assurance: once you’ve drilled an oil well, how can you ensure oil is actually flowing out of that well?

This might seem like a niche question, but as anyone from Texas will tell you, it’s a very critical question that lies at the heart of the state’s infrastructure and can have a lasting and direct economic impact on consumers. If you drive a car, then you’ve undoubtedly been affected by the work Andrew does: AFS is responsible for keeping oil flowing in many of the world’s major oil fields and wells, especially offshore sites deep in the ocean.

Andrew is the technology manager at AFS, and has been centralizing the computations done in-house—computations that used to be done with a hodgepodge of various tools and methods. Through EnterpriseCDF, Andrew was able to create about 40 unique dashboards for different calculations and workflows that analyze aspects of flow assurance used to keep the oil flowing in different parts of the world.

And That’s a Wrap!

The Innovator Awards are a way to pay homage to people who do great things using Wolfram technology: each recipient has leveraged the language to do exciting things in their community, and it’s through their work that we really see how powerful a tool the Wolfram Language is.

A Note

While we are no doubt happy to announce the recipients of the 2017 Wolfram Innovator Awards, I want to take a moment to highlight the fact that it’s not as diverse a list as it could be—not a single woman was presented with an Innovator Award this year, and none of the winners from any year were under the age of 35. This is due to a variety of factors, but it would be remiss of me to ignore this reality—especially when you can so easily see it.

In past years, we’ve had diverse lists of innovators that better represented the broad spectrum of users that accomplish great things with our technology. Which is why, going forward, we hope to highlight the efforts of innovators from varying backgrounds and give them a platform to showcase their achievements. There are so many people out there using the Wolfram Language for interesting things, and we want the rest of the world to see that!

To read a more in-depth account of the scope of each individual project, visit the Wolfram Innovator Awards website.


9. Inside Scoops from the 2017 Wolfram Technology Conference (Wed, Nov 1)

Wolfram Technology Conference

Two weeks ago at the Wolfram Technology Conference, a diverse lineup of hands-on training, workshops, talks and networking events was orchestrated over the course of four days, culminating in a one-of-a-kind annual experience for users and enthusiasts of Wolfram technologies. It was a unique experience where researchers and professionals interacted directly with those who build each component of the Wolfram technology stack: Mathematica, Wolfram|Alpha, the Wolfram Language, Wolfram SystemModeler, Wolfram Enterprise Private Cloud and everything in between.

Users from across disciplines, industries and fields also interacted with one another to share how they use Wolfram technologies to successfully innovate at their institutions and organizations. It was not uncommon for software engineers or physicists to glean new tricks and tools from a social scientist or English teacher—or vice versa—a testament to the diversity and wide range of cutting-edge uses Wolfram technologies provide.

A Brief Data Analysis of the Conference

Attendees traveled from 18 countries for the experience, representing fields from mathematical physics to education and curriculum development.

Conference attendee map

One hundred thirty-nine talks were divided into five broad tracks: Information and Data Science; Education; Cloud and Software Development; Visualization and Image Processing; and Science, Math and Engineering.
Presentation topic chart

We can take a look at talk abstracts by track using WordCloud. See if you can guess which ones they correspond to.

Word clouds 1

If you guessed Data Science, Science/Math/Engineering, Cloud/Software Development, Education, and Visualization/Image Processing from left to right by row, you have a keen eye.

We can also look at all talk abstracts divided into nouns, verbs, adjectives and adverbs. Perhaps a Wolfram Technology Conference abstract generator could be built upon this.

Word clouds 2

It was most impressive to see those at the conference who use the Wolfram Language for data science, education or medical image processing be able to ask questions directly to the R&D experts and software developers who make those tools possible for them. As Stephen Wolfram said, “It’s fun to build these things, but it’s perhaps even more fun to see the cool ways people use it.”

A highlight of the conference was the keynote dinner in honor of the 2017 Wolfram Innovator Award winners. Nine individuals and organizations from finance, education, oil and gas, applied research, academia and engineering were represented. We’ll have more on these outstanding individuals in a forthcoming blog post. For now, a tease of where they came from.

Award winner map

Hands-on Workshops

The conference kicked off with hands-on training, where attendees received individualized instruction on how to use the Wolfram Language for their research and professional projects, led by experts in Wolfram technologies for data science and statistical analysis.

Hands-on training session

Abrita Chakravarty, team lead of the Wolfram Technology Group, guided participants through a deep dive into data science with a morning session focused on project workflows, followed by an afternoon workshop devoted to data wrangling and analytical methods. Among the Wolfram Language functions highlighted were FeatureExtraction, DimensionReduce, Classify, Predict and more of Wolfram’s sophisticated machine learning algorithms for highly automated data science applications.
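
To give a flavor of these functions, here is a minimal sketch of DimensionReduce on synthetic data. The random points are an assumption for illustration and are not workshop material.

```wolfram
SeedRandom[1];

(* 100 synthetic points in 5 dimensions *)
points = RandomReal[1, {100, 5}];

(* Project the points down to 2 dimensions with an automatically chosen method *)
reduced = DimensionReduce[points, 2];

Dimensions[reduced]
```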

For further explanation of the new machine learning features released in Wolfram Language Version 11.2, take a look at the recent blog post “Building the Automated Data Scientist: The New Classify and Predict” by Etienne Bernard, lead architect in Wolfram’s Advanced Research Group. You can also view a livestream of Etienne demonstrating these features on Twitch.

In addition to hands-on training in data science, Tuseeta Banerjee, a Wolfram certified instructor, led a morning workshop on applied statistical analysis with the Wolfram Language. From hypothesis testing using DistributionFitTest to automated modeling using GeneralizedLinearModelFit, among many other functions, attendees were given the tools necessary to tell a complete analytical story from exploratory analysis and descriptive statistics to predictive analytics and visualization.
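
The two functions named above can be tried on synthetic data, as in this short sketch. The distributions and parameters are assumptions chosen for illustration, not examples from the workshop.

```wolfram
SeedRandom[42];

(* Synthetic sample drawn from a normal distribution *)
sample = RandomVariate[NormalDistribution[10, 2], 200];

(* p-value for the null hypothesis that the sample is normally distributed *)
pValue = DistributionFitTest[sample];

(* Synthetic count data whose mean grows exponentially with x *)
counts = Table[{x, RandomVariate[PoissonDistribution[Exp[0.3 x]]]}, {x, 0., 5., 0.1}];

(* Poisson regression with a log link, the default for this family *)
model = GeneralizedLinearModelFit[counts, x, x, ExponentialFamily -> "Poisson"];

model["BestFitParameters"]
```

With data generated this way, the fitted slope should land somewhere near 0.3, though the exact value depends on the random draw.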

Wolfram U has on-demand courses available in data science, statistics, programming and other domains if you’re interested in learning how to use cutting-edge methods and the largest collection of built-in algorithms available in the Wolfram tech stack.

Stephen Wolfram’s Keynote Address

A highlight of the conference was Stephen Wolfram’s annual keynote talk, which covered an incredible amount of information over two and a half hours of live demonstrations in the Wolfram Language.

Marking 30 years of R&D at Wolfram Research, which was founded in 1987, Stephen noted that this is about half the time since modern computer languages were invented. Next year, Wolfram celebrates 30 years of Mathematica. It is fairly rare for software to remain so widely used for so long, but the sheer amount of innovation that has gone into the product has kept it relevant.

Stephen highlighted some of the many new features in Wolfram Language Version 11.2 and noted ImageIdentify, which was announced in 2015, is at once a pioneer in general-purpose image identification neural networks and still paving the way as a building block in new Wolfram technologies. The neural net has been trained so well it can identify a jack o’ lantern carved in the fashion of Stephen Wolfram’s Wolfram|Alpha person curve.

Jack o' lantern

From there, Stephen touched on everything from cloud and mobile apps to blockchain, with some examples of their uses for individuals, organizations and enterprise. It was a fast-paced, quickfire presentation that covered two Wolfram Language version releases (11.1 and 11.2), the Wolfram Data Repository, SystemModeler 5, Wolfram Player for iOS, Wolfram|Alpha Enterprise and a slew of upcoming functionalities on the horizon. He also hinted at how some large and well-known companies are using the Wolfram tech stack in ways that allow millions of people to interact with them on a daily basis.

Word cloud 3

Stephen told the audience about ongoing efforts in K–12 education and Wolfram Programming Lab, along with resources available to individuals of all ages and organizations of all sizes to learn more about how to use the Wolfram tech stack in their work and projects.

He also pointed to his recent livestreams, open and accessible to anyone, of Wolfram Language design review meetings—a rare glimpse into how software is actually made. You can view the collection of on-demand videos here.

Wolfram R&D Expert Panel

Each year at the Tech Conference, a panel of Wolfram R&D experts previews what’s new in Wolfram technologies and what’s to come. This gives attendees a unique opportunity to learn how Wolfram technologies are made and to ask the people driving innovation at Wolfram about future functionality. This year’s panel included:

  • Arnoud Buzing, Director of Quality and Release Management
  • John Fultz, Director of User Interface Technology
  • Roger Germundsson, Director of Research and Development
  • Tom Wickham-Jones, Director of Kernel Technology

Roger gave an overview of the hundreds of new functions introduced in Wolfram Language 11.2, along with hundreds more improved functions that are continually in development.

Expert panel

Perhaps one of the biggest highlights was a preview of the Wolfram Neural Net Repository, which provides a uniform system for storing neural net models in an immediately computable form.

Neural Net Repository

Including models from the latest research papers, as well as ones trained or created at Wolfram Research, the Wolfram Neural Net Repository is built to be a global resource for neural net models. Classification, image processing, feature extraction and regression are just a few clicks away using the Wolfram tech stack.
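
As a hedged sketch of what "immediately computable" can look like, the snippet below loads a model with NetModel and applies it to an image. The model name is an assumption about what the repository offers, the test image is generated on the spot rather than taken from real data, and the first call needs network access to download the model.

```wolfram
(* Load a pretrained digit classifier; the model name is assumed to be available *)
net = NetModel["LeNet Trained on MNIST Data"];

(* A crude white-on-black "7", rasterized as a stand-in for a handwritten digit *)
digit = ColorNegate[Rasterize[Style["7", 40], "Image"]];

(* The net's built-in encoder resizes the image before classifying it *)
net[digit]
```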

Let’s look at some of the more creative uses of the Wolfram Language presented at talks during the conference.

Creative Highlight 1: Marathon Viewer

Jeff Bryant and Eila Stiegler demonstrated how the Wolfram Language can be used to analyze races and marathons, using the Illinois Marathon as an example.
Illinois marathon animation

Using data from the race, they showed how functions like Interpreter can ease the pain of wrangling and cleaning data. Jeff and Eila were able to take the data and create an animation that shows each runner’s progress through the marathon, with dynamics that indicate volumes of runners at any given time and location. Not only is this incredibly useful for people virtually tracking the progress of runners, but it also has applications for city and urban planning.
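
A small sketch of this kind of cleanup is shown below. The row format and field names are hypothetical, and toSeconds is a helper invented here; the talk's actual data and code are not reproduced.

```wolfram
(* A hypothetical raw row, with every field read in as a string *)
rawRow = <|"Bib" -> "1042", "Age" -> "37", "Finish" -> "3:41:09"|>;

(* Interpreter turns free-form strings into typed values *)
toInteger = Interpreter["Integer"];

(* Helper (not a built-in): convert "h:mm:ss" to a total number of seconds *)
toSeconds[s_String] := FromDigits[FromDigits /@ StringSplit[s, ":"], 60];

clean = <|
   "Bib" -> toInteger[rawRow["Bib"]],
   "Age" -> toInteger[rawRow["Age"]],
   "FinishSeconds" -> toSeconds[rawRow["Finish"]]
   |>
```

Strings that cannot be interpreted come back as Failure objects, which makes bad rows easy to filter out before any analysis.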

Creative Highlight 2: Building an Interactive Game Modeled on Jeopardy!

Robert Nachbar, project director with Wolfram Solutions, demonstrated an interactive game of Jeopardy! built in the Wolfram Language that he modestly said took him about a weekend to build.

Jeopardy! game

Attendees were impressed with its functionality and excited about its application in education. Using built-in Wolfram Language functions like Dynamic and interactive buttons, Robert showed how an API call can be used to create a game of Jeopardy! with existing clues or how a custom game can be built. To demonstrate, he used clues and questions specific to Wolfram Language documentation. Toward the end of the presentation, a brief game was played providing clues to Wolfram Language functions that the audience could then respond to, providing a nice model for learning any topic one might think to program into the game.
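
A toy version of a single clue can be sketched with Dynamic and Button. This is only an illustration of the idea, with a made-up clue and scoring, and not Robert's implementation.

```wolfram
DynamicModule[{score = 0, answered = False},
 Column[{
   "Clue: This function sorts a list into canonical order.",
   Row[{
     (* Each button updates the score once, then locks the clue *)
     Button["What is Sort?", If[! answered, score += 200; answered = True]],
     Button["What is Reverse?", If[! answered, score -= 200; answered = True]]
     }],
   (* The displayed score updates automatically whenever it changes *)
   Dynamic[Row[{"Score: ", score}]]
   }]]
```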

Creative Highlight 3: Food Data in Wolfram|Alpha

Andrew Steinacher, developer in Wolfram|Alpha scientific content, gave his third talk in as many years on food and nutrition data in Wolfram|Alpha. The Wolfram Language now has nearly 150,000 searchable (and computable) foods. New computable features include PLU codes, used by grocery stores worldwide, and acidity levels, full nutritional information, ingredients and substitutions, along with barcode recognition for better alignment with international foods.

Wolfram|Alpha food

Future goals for food data in the Wolfram Language include better food and nutrition coverage for the rest of the world, specifically Asia; adding more packaged foods and more available data, such as storage temperatures, packaging dimensions and materials; aligning ingredient entities to chemical entities; new FDA nutrition labels with support for multiple sizes/styles; and computational recipes, including food quantities and nutrition, actions, equipment and substitutions. One can easily imagine how these tools could drive innovation in the food production and food service industries.

Creative Highlight 4: Presenting Presenter Tools

In something of a meta-talk, Andre Kuzniarek, director of Document and Media Systems, gave a presentation of Presenter Tools, an upcoming feature of Wolfram desktop products. In his talk, he showed how talks created in Wolfram Notebooks can be prepared and presented with convenient formatting tools and dynamic content scaling to match any screen resolution. While some of this functionality already exists in the Wolfram Language, this improved framework elevates presentations to a new level of aesthetics and interactivity.

Wolfram Livecoding Championship

A fun evening highlight of the conference was the Wolfram Livecoding Championship led by Stephen Wolfram. For the contest, Stephen gave challenges to the participants, and they were then tasked with finding a solution to the problem using an elegant piece of Wolfram Language code.

Approximately 20 participants took part in the contest and responded to challenges ranging from finding digit sequences in π to string manipulation to finding the earliest 2016 sunrise in Champaign, Illinois.
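
For a sense of what such a challenge involves, here is one possible way to look for a digit sequence in π. The target sequence and the number of digits searched are arbitrary choices for illustration, not the actual contest problem.

```wolfram
(* The first 2000 digits of Pi, including the leading 3 *)
digits = First[RealDigits[Pi, 10, 2000]];

(* Arbitrary target sequence chosen for illustration *)
target = {2, 0, 1, 7};

(* Each result is a {start, end} pair of positions in the digit list *)
positions = SequencePosition[digits, target];

(* Position 1 is the leading 3, so subtract 1 to get the decimal place *)
First /@ positions - 1
```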

Jon McLoone, director of Technical Communication and Strategy at Wolfram Research Europe, took home the prize for the most solved challenges.

Livecoding Championship

The event was streamed live on the Wolfram Research and Stephen Wolfram Twitch channels, and you can watch the video-on-demand here.

Wolfram Language Logo ImageRestyle Competition

Wolfie restyle samples

This year, a new contest was introduced to see who could use ImageRestyle, new in Wolfram Language Version 11.2, to generate the most creative and interesting version of the Wolfram Language logo. Over 70 entries were received, and contestants were required to use the logo and another image or images to come up with a new machine-generated version of “Wolfie.”
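
A basic call looks roughly like the sketch below. The two test images are stand-ins chosen for illustration (contest entries used the Wolfram Language logo as the content image), and the first evaluation may need to download the underlying neural network.

```wolfram
(* Stand-in images; the contest used the Wolfram Language logo as content *)
content = ExampleData[{"TestImage", "Mandrill"}];
style = ExampleData[{"TestImage", "Peppers"}];

(* Redraw the content image in the visual style of the second image *)
restyled = ImageRestyle[content, style]
```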

This year’s winner was Emmanuel Garces Madina for the following submission.

Wolfie restyle winner

More Wolfram Technology Conference Posts to Come

Chris Carlson, senior user interface developer, will present a recap of this year’s One-Liner Competition. Also, technical writer Jesse Dohmann will introduce this year’s winners of the Wolfram Technology Innovator Awards.


10. Building the Automated Data Scientist: The New Classify and Predict (Tue, Oct 10)

Automated Data Science

Imagine a baker connecting a data science application to his database and asking it, “How many croissants are we going to sell next Sunday?” The application would simply answer, “According to your recorded data and other factors such as the predicted weather, there is a 90% chance that between 62 and 67 croissants will be sold.” The baker could then plan accordingly. This is an example of an automated data scientist, a system at which you could throw arbitrary data and get insights or predictions in return.

One key component in making this a reality is the ability to learn a predictive model without specifications from humans besides the data. In the Wolfram Language, this is the role of the functions Classify and Predict. For example, let’s train a classifier to distinguish morels from hedgehog mushrooms:

c = Classify[{

We can now use the resulting ClassifierFunction on new examples:



And we can obtain a probability for each possibility:

As another example, let’s train a PredictorFunction to predict the average monthly temperature for some US cities:

data = RandomSample[ResourceData["Sample Data: US City Temperature"]]

p = Predict[data ->

Again, we can use the resulting function to make a prediction:


And we can obtain a distribution of predictions:

dist = p[<|

As you can see, Classify and Predict do not need to be told what the variables are, what preprocessing to perform or which algorithm to use: they are automated functions.
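
Since the inputs above were shown as images and a remote dataset, here is a self-contained sketch of the same workflow on tiny made-up numeric data. The data values and class names are assumptions for illustration only.

```wolfram
(* Toy labeled data: small numbers versus large numbers *)
training = {1.1 -> "small", 1.4 -> "small", 2.0 -> "small",
   7.8 -> "large", 8.3 -> "large", 9.1 -> "large"};

c = Classify[training];
c[1.6]                    (* most likely class for a new example *)
c[5.0, "Probabilities"]   (* probability of each class *)

(* Toy regression data, roughly y = 2 x *)
p = Predict[{1 -> 2.1, 2 -> 3.9, 3 -> 6.2, 4 -> 7.8, 5 -> 10.1}];
p[6]                      (* point prediction *)
p[6, "Distribution"]      (* predictive distribution *)
```

No method, preprocessing or feature type is specified anywhere; both functions choose these on their own.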

New Classify and Predict

We introduced Classify and Predict in Version 10 of the Wolfram Language (about three years ago), and have been happy to see them used in various contexts (my favorite involves an astronaut, a plane and a Raspberry Pi). In Version 11.2, we decided to give these functions a complete makeover. The most visible update is the introduction of an information panel in order to get feedback during the training:

Classify progress animation

With it, one can monitor things such as the current best method and the current accuracy, and one can get an idea of how long the training will be—very useful in deciding if it is worth continuing or not! If one wants to stop the training, there are two ways to do it: either with the Stop button or by directly aborting the evaluation. In both cases, the best model that Classify and Predict came up with so far is returned (but the Stop interruption is softer: it waits until the current training is over).

A similar panel is now displayed when using ClassifierInformation and PredictorInformation on a classifier or a predictor:

Classify set 1

We tried to show some useful information about the model, such as its accuracy (on a test set), the time it takes to evaluate new examples and its memory size. More importantly, you can see a “learning curve” on the bottom that shows the value of the loss (the measure that one is trying to minimize) as a function of the number of examples that have been used for training. By pressing the left/right arrows, one can also look at other curves, such as the accuracy as a function of the number of training examples:

Classify set 2

Such curves are useful in figuring out if one needs more data to train on or not (e.g. when the curves are plateauing). We hope that giving easy access to them will ease the modeling workflow (for example, it might reduce the need to use ClassifierMeasurements and PredictorMeasurements).
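
For comparison, an accuracy curve can also be computed by hand with ClassifierMeasurements, as in this sketch. The synthetic data and the training sizes are assumptions for illustration.

```wolfram
SeedRandom[7];

(* Synthetic one-dimensional data with a simple threshold rule *)
label[x_] := If[x > 0.5, "high", "low"];
examples = (# -> label[#]) & /@ RandomReal[1, 300];
{train, test} = TakeDrop[examples, 200];

c = Classify[train];
ClassifierInformation[c]   (* displays the information panel *)

(* Accuracy on held-out data as a function of the training set size *)
curve = Table[
   {n, ClassifierMeasurements[Classify[Take[train, n]], test, "Accuracy"]},
   {n, {25, 50, 100, 200}}];

ListLinePlot[curve, AxesLabel -> {"training examples", "accuracy"}]
```

This retrains a classifier for every point of the curve, which is the extra work the built-in panel saves.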

An important update is the addition of the TimeGoal option, which allows one to specify how long one wishes the training to take, e.g.:

c = Classify[{1, 2, 3, 4} -> {


TimeGoal has a different meaning than TimeConstraint: it is not about specifying a maximum amount of time, but really a goal that should be reached. Setting a higher time goal allows the automation system to try additional things in order to find a better model. In my opinion, this makes TimeGoal the most important option of both Classify and Predict (followed by Method and PerformanceGoal).
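
A minimal sketch of the option in use, with synthetic data and time goals (in seconds) chosen arbitrarily for illustration:

```wolfram
SeedRandom[3];

(* Synthetic data: the class depends on the sign of Sin[x] *)
data = (# -> If[Sin[#] > 0, "pos", "neg"]) & /@ RandomReal[10, 500];

(* A short time goal returns a model quickly *)
quick = Classify[data, TimeGoal -> 2];

(* A longer goal lets the automation try more configurations *)
thorough = Classify[data, TimeGoal -> 20];

{quick[1.0], thorough[1.0]}
```

Whether the longer goal actually yields a better model depends on the data; on an easy problem like this one the two may perform similarly.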

On the method side, things have changed as well. Each method now has its own documentation page ("LogisticRegression", "NearestNeighbors", etc.) that gives generic information and allows experts to play with the options that are described. We also added two new methods: "DecisionTree" and, more notably, "GradientBoostedTrees", which is a favorite of data scientists. Here is a simple prediction example:

data = # -> Sin[2 #] + Cos[#] + RandomReal[] & /@ RandomReal[10, 200];

p = Predict[data, Method ->
Show[ListPlot[List @@@ data, PlotStyle -> Gray, PlotLegends -> {
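
For readers who want something they can evaluate directly, here is a self-contained variant of this example. The plot styling is an assumption and not necessarily what the original post used.

```wolfram
SeedRandom[11];

(* Noisy samples of Sin[2 x] + Cos[x] on the interval [0, 10] *)
data = # -> Sin[2 #] + Cos[#] + RandomReal[] & /@ RandomReal[10, 200];

(* Train a predictor with the gradient boosted trees method *)
p = Predict[data, Method -> "GradientBoostedTrees"];

(* Overlay the learned function on the training points *)
Show[
 ListPlot[List @@@ data, PlotStyle -> Gray],
 Plot[p[x], {x, 0, 10}, PlotStyle -> Red]]
```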

Under the Hood…

OK, let’s now get to the main change in Version 11.2, which is not directly visible: we reimplemented the way Classify and Predict determine the optimal method and hyperparameters for a given dataset (in a sense, the core of the automation). For those who are interested, let me try to give a simple explanation of how this procedure works for Classify.

A classifier needs to be trained using a method (e.g. "LogisticRegression", "RandomForest", etc.) and each method needs to be given some hyperparameters (such as "L2Regularization" or "NeighborsNumber"). The automation procedure is there to figure out the best configuration (i.e. the best method + hyperparameters) to use according to how well the classifier (trained with this configuration) performs on a test set, but also how fast or how small in memory the classifier is. It is hard to know if a given configuration would perform well without actually training and testing it. The idea of our procedure is to start with many configurations that we believe could perform well (let’s say 100), then train these configurations on small datasets and use the information gathered during these “experiments” to predict how well the configurations would perform on the full dataset. The predictions are not perfect, but they are useful in selecting a set of promising configurations that will be trained on larger datasets in order to gather more information (you might notice some similarities with the Hyperband procedure). This operation is repeated until only a few configurations (sometimes even just one) are trained on the full dataset. Here is a visualization of the loss function for some configurations (each curve represents a different one) that underwent this operation:
Training graph

As you can see, many configurations have been trained on 10 and 40 examples, but just a few of them on 200 examples, and only one of them on 800 examples. We found in our benchmarks that the final configuration obtained is often the optimal one (among the ones present in the initial configuration set). Also, since training on smaller datasets is faster, the time needed for the entire procedure is not much greater than the time needed to train one configuration on the full dataset, which, as you can imagine, is much faster than training all configurations on the full dataset!
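
The general shape of this procedure can be illustrated with plain successive halving, sketched below. This is a deliberate simplification and not the actual implementation: it simply keeps the better half of the configurations at each stage instead of predicting full-dataset performance, the "configurations" are just four method names, and the data is synthetic.

```wolfram
SeedRandom[1];

(* Synthetic two-class data, an assumption for illustration *)
makeExample[] := With[{x = RandomReal[{-1, 1}, 2]}, x -> If[Total[x] > 0, "A", "B"]];
train = Table[makeExample[], 800];
valid = Table[makeExample[], 200];

(* Candidate configurations: here just a method name each *)
configs = {"LogisticRegression", "NearestNeighbors", "RandomForest", "DecisionTree"};

(* Train one configuration on the first n examples and score it on held-out data *)
score[method_, n_] := ClassifierMeasurements[
   Classify[Take[train, n], Method -> method], valid, "Accuracy"];

(* At each stage, train the survivors on more data and keep the better half *)
survivors = configs;
Do[
  ranked = SortBy[survivors, -score[#, n] &];
  survivors = Take[ranked, Max[1, Ceiling[Length[ranked]/2]]],
  {n, {50, 200, 800}}];

survivors
```

Only one configuration is ever trained on all 800 examples, which is where the time savings described above come from.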

Besides being faster than the previous version, this automation strategy was necessary to enable some of the capabilities that I presented above. For example, the procedure directly produces estimates of model performance and learning curves. Also, it enables the display of a progress bar and quickly produces valid models that can be returned if the Stop button is pressed. Finally, it enables the introduction of the TimeGoal option by adapting the number of intermediate trainings depending on the amount of time available.

We hope that you will find ways to use this new version of Classify and Predict. Don’t hesitate to give us feedback. The road to a fully automated data scientist is still long, but we’re getting closer!

