Thursday, July 20, 2017

Politics 101

The Supreme Court and the Law of Motion

Back in the late ’90s, a Supreme Court case on the validity of a police search produced five separate opinions and left considerable doubt about who would ultimately benefit from the decision, prosecutors or criminal defendants. In our stories the next day, my opposite number at The Washington Post and I described the decision, Minnesota v. Carter, quite differently. At a reception a few days later, we encountered Justice Stephen G. Breyer, who had written one of the separate opinions. We told him about our confusion and differing interpretations.

Which of us was right, we asked. Who actually won the case?

“I don’t know,” Justice Breyer replied.

“How can you not know?” I pushed back.

The justice replied: “It depends on what the lower courts make of it.”

At first, that answer struck me as coy, even evasive, but the more I thought about it — and I’ve actually thought about it over the years — the more accurate, even profound, it has come to appear. Ours is a common-law system, in which one case follows another and legal doctrine emerges from the crucible of decided cases. Accepting only about 65 cases a year, the Supreme Court sits on the top of a very big pyramid: thousands of cases pass through the system every year, and judges are tasked with finding and applying relevant Supreme Court precedents to new cases with facts that differ, slightly or quite a lot, from those in the original case.

I thought about the conversation with Justice Breyer as I read a decision the Supreme Court issued on the final day of its term. In Trinity Lutheran Church v. Comer, the court ruled that a church had a constitutional right to be considered on the same basis as secular institutions for a state grant to improve the safety of its preschool playground. The decision, with a majority opinion by Chief Justice John G. Roberts Jr., was a sharp departure from the line the Supreme Court has maintained against the direct funding of churches (the church had described the preschool as part of its religious mission, so there was no dispute in the case that the school fully shared the church’s identity).

Like most other states, Missouri, where the case arose, has a constitutional prohibition against public money going to churches. Invoking that provision, Missouri had deemed Trinity Lutheran categorically ineligible for the playground grant, although its preschool otherwise met the specified criteria. That exclusion put the church “to the choice between being a church and receiving a government benefit,” Chief Justice Roberts wrote. The court held that the exclusion amounted to discrimination against religion that was “odious to our Constitution,” a violation of the church’s First Amendment right to the free exercise of religion.

Strong words and very broad implications, but the court stopped short of taking the obvious next step of declaring unconstitutional not only Missouri’s handling of a specific grant program, but also the state constitutional provision on which the state’s action was based. Instead of playing out the implications of its decision, the majority opinion contained a footnote: “This case involves express discrimination based on religious identity with respect to playground resurfacing. We do not address religious uses of funding or other forms of discrimination.”

To call this footnote odd is an understatement. The Supreme Court doesn’t usually summon the majestic, if opaque, phrases of the Constitution to swat a mouse (in this case, a program to use recycled tires as a playground surface). Of the seven justices in the majority (only Justices Sonia Sotomayor and Ruth Bader Ginsburg dissented), only four subscribed to the footnote: Justices Anthony M. Kennedy, Samuel A. Alito Jr., and Elena Kagan, in addition to the chief justice. Justice Breyer concurred only in the judgment, so the footnote was not part of an opinion that he signed. And Justices Clarence Thomas and Neil M. Gorsuch objected that the footnote narrowed the holding of the case unrealistically.

So the footnote doesn’t even speak for a majority of the nine-member court. Chief Justice Roberts probably added it to satisfy the demand of Justice Kagan or Justice Kennedy, or both, for some limitation on the decision; it was crucial to hold them, even at the price of alienating Justices Gorsuch and Thomas, who were going to stay with the majority in all other respects. No matter what the footnote meant when the decision was issued on June 26, the question now is: What will it mean in the future? And that brings me back to Justice Breyer’s long-ago answer: It depends on what the lower courts make of it.

History offers a lesson: There is a momentum to Supreme Court decisions, and efforts to cabin the logical progression of legal doctrine will fail if the political and cultural forces that led to the doctrine in the first place remain in play. It’s Newton’s law of motion in the legal context: A doctrine in motion will stay in motion unless met by an outside force — a backlash or a change of cast. The steps can be small, but they can add up to giant steps.

by Linda Greenhouse, NY Times |  Read more:
Image: Stephen Crowley

Dave McKean, Cover art for issue #11 of The Sandman

Wednesday, July 19, 2017

Kurt Vonnegut Walks Into a Bar

I was on the corner of Third Avenue and Forty-eighth Street, and Kurt Vonnegut was coming toward me, walking his big, loose-boned walk. It was fall and turning cold and he looked a little unbalanced in his overcoat, handsome but tousled, with long curly hair and a heavy mustache that sometimes hid his grin. I could tell he saw me by his shrug, which he sometimes used as a greeting.

I was on my way to buy dinner for some Newsweek writers who were suspicious of me as their new assistant managing editor. I had been brought in from Rolling Stone, and no one at Newsweek had heard of me. I didn’t know them either, but I knew Kurt, who was one of the first people I met when I moved to New York. We were neighbors on Forty-eighth Street, where he lived in a big townhouse in the middle of the block, and he’d invite me over for drinks. I had gotten him to contribute to Rolling Stone by keeping an eye out for his speeches and radio appearances and then suggesting ways they could be retooled as essays.

“Come have dinner,” I said. “I’ve got some Newsweek writers who would love to meet you.”

“Not in the mood,” Kurt said.

“They’re fans,” I said. “It’s part of your job.”

Kurt lit a Pall Mall and gave me a look, one of his favorites, amused but somehow saddened by the situation. He could act, Kurt.

“Think of it as a favor to me,” I said. “They’re not sure about me, and I’ve edited you.”

“Sort of,” he said, and I knew he had already had a couple drinks.

He never got mean, but he got honest.

“What else are you doing for dinner?” I said, knowing he seldom made plans.

“The last thing I need is ass kissing,” Kurt said.

“That’s what I’m doing right now.”

“They’ll want to know which novel I like best.”

“Cat’s Cradle,” I said.

“Wrong.” He flipped the Pall Mall into Forty-eighth Street, and we started walking together toward the restaurant.

The writers were already at the table, drinks in front of them. They looked up when we came in, surprised to see Kurt with me. There were six or eight of them, including the columnist Pete Axthelm, who was my only ally going into Newsweek because I knew him from Runyon’s, a bar in our neighborhood where everyone called him Ax.

I introduced Kurt around.

“Honored,” Ax said, or something like that, and the ass kissing began.

by Terry McDonell, Electric Lit | Read more:
Image: uncredited

The Limitations of Deep Learning

Deep learning: the geometric view

The most surprising thing about deep learning is how simple it is. Ten years ago, no one expected that we would achieve such amazing results on machine perception problems by using simple parametric models trained with gradient descent. Now, it turns out that all you need is sufficiently large parametric models trained with gradient descent on sufficiently many examples. As Feynman once said about the universe, "It's not complicated, it's just a lot of it".

In deep learning, everything is a vector, i.e. everything is a point in a geometric space. Model inputs (it could be text, images, etc.) and targets are first "vectorized", i.e. turned into some initial input vector space and target vector space. Each layer in a deep learning model operates one simple geometric transformation on the data that goes through it. Together, the chain of layers of the model forms one very complex geometric transformation, broken down into a series of simple ones. This complex transformation attempts to map the input space to the target space, one point at a time. This transformation is parametrized by the weights of the layers, which are iteratively updated based on how well the model is currently performing. A key characteristic of this geometric transformation is that it must be differentiable, which is required in order for us to be able to learn its parameters via gradient descent. Intuitively, this means that the geometric morphing from inputs to outputs must be smooth and continuous—a significant constraint.
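[ed. The idea of a chain of simple, differentiable transformations whose weights are updated by gradient descent can be sketched in a few lines of plain Python. This is only a toy illustration, not code from the original post: the two scalar "layers", the sine target, the starting weights and the learning rate are all assumptions chosen to keep the example small.]

```python
import math

def layer1(x, w1, b1):
    """First simple transformation: scale, shift, then squash with tanh."""
    return math.tanh(w1 * x + b1)

def layer2(h, w2, b2):
    """Second simple transformation: scale and shift."""
    return w2 * h + b2

def model(x, params):
    """The full model is just the two layers chained together."""
    w1, b1, w2, b2 = params
    return layer2(layer1(x, w1, b1), w2, b2)

def loss_and_grads(xs, ys, params):
    """Mean squared error and its gradient with respect to each weight.

    The gradient is obtained with the chain rule, which is only possible
    because every step of the model is differentiable.
    """
    w1, b1, w2, b2 = params
    n = len(xs)
    loss = 0.0
    grads = [0.0, 0.0, 0.0, 0.0]  # d(loss)/d(w1, b1, w2, b2)
    for x, y in zip(xs, ys):
        h = layer1(x, w1, b1)
        err = layer2(h, w2, b2) - y
        loss += err * err / n
        d_pred = 2.0 * err / n                # gradient at the model output
        d_z = d_pred * w2 * (1.0 - h * h)     # back through layer2, then tanh
        grads[0] += d_z * x
        grads[1] += d_z
        grads[2] += d_pred * h
        grads[3] += d_pred
    return loss, grads

def train(xs, ys, params, lr=0.1, steps=500):
    """Plain gradient descent: nudge each weight against its gradient."""
    params = list(params)
    history = []
    for _ in range(steps):
        loss, grads = loss_and_grads(xs, ys, params)
        history.append(loss)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params, history

# Assumed toy task: map x to sin(x) on a densely sampled interval.
xs = [i / 10.0 for i in range(-15, 16)]
ys = [math.sin(x) for x in xs]
params, history = train(xs, ys, [0.5, 0.0, 0.5, 0.0])
print(f"loss before training: {history[0]:.4f}, after: {history[-1]:.4f}")
print(f"model(1.0) = {model(1.0, params):.3f}, sin(1.0) = {math.sin(1.0):.3f}")
```

[ed. Real models differ mainly in scale: the inputs are high-dimensional vectors, there are many more layers, and the gradients are computed automatically rather than by hand.]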

The whole process of applying this complex geometric transformation to the input data can be visualized in 3D by imagining a person trying to uncrumple a paper ball: the crumpled paper ball is the manifold of the input data that the model starts with. Each movement operated by the person on the paper ball is similar to a simple geometric transformation operated by one layer. The full uncrumpling gesture sequence is the complex transformation of the entire model. Deep learning models are mathematical machines for uncrumpling complicated manifolds of high-dimensional data.

That's the magic of deep learning: turning meaning into vectors, into geometric spaces, then incrementally learning complex geometric transformations that map one space to another. All you need are spaces of sufficiently high dimensionality in order to capture the full scope of the relationships found in the original data.

The limitations of deep learning

The space of applications that can be implemented with this simple strategy is nearly infinite. And yet, many more applications are completely out of reach for current deep learning techniques—even given vast amounts of human-annotated data. Say, for instance, that you could assemble a dataset of hundreds of thousands—even millions—of English language descriptions of the features of a software product, as written by a product manager, as well as the corresponding source code developed by a team of engineers to meet these requirements. Even with this data, you could not train a deep learning model to simply read a product description and generate the appropriate codebase. That's just one example among many. In general, anything that requires reasoning—like programming, or applying the scientific method—long-term planning, and algorithmic-like data manipulation, is out of reach for deep learning models, no matter how much data you throw at them. Even learning a sorting algorithm with a deep neural network is tremendously difficult.

This is because a deep learning model is "just" a chain of simple, continuous geometric transformations mapping one vector space into another. All it can do is map one data manifold X into another manifold Y, assuming the existence of a learnable continuous transform from X to Y, and the availability of a dense sampling of X:Y to use as training data. So even though a deep learning model can be interpreted as a kind of program, inversely most programs cannot be expressed as deep learning models—for most tasks, either there exists no corresponding practically-sized deep neural network that solves the task, or even if there exists one, it may not be learnable, i.e. the corresponding geometric transform may be far too complex, or there may not be appropriate data available to learn it.

Scaling up current deep learning techniques by stacking more layers and using more training data can only superficially palliate some of these issues. It will not solve the more fundamental problem that deep learning models are very limited in what they can represent, and that most of the programs that one may wish to learn cannot be expressed as a continuous geometric morphing of a data manifold.

The risk of anthropomorphizing machine learning models

One very real risk with contemporary AI is that of misinterpreting what deep learning models do, and overestimating their abilities. A fundamental feature of the human mind is our "theory of mind", our tendency to project intentions, beliefs and knowledge on the things around us. Drawing a smiley face on a rock suddenly makes it "happy"—in our minds. Applied to deep learning, this means that when we are able to somewhat successfully train a model to generate captions to describe pictures, for instance, we are led to believe that the model "understands" the contents of the pictures, as well as the captions it generates. We then proceed to be very surprised when any slight departure from the sort of images present in the training data causes the model to start generating completely absurd captions. (...)

Humans are capable of far more than mapping immediate stimuli to immediate responses, like a deep net, or maybe an insect, would do. They maintain complex, abstract models of their current situation, of themselves, of other people, and can use these models to anticipate different possible futures and perform long-term planning. They are capable of merging together known concepts to represent something they have never experienced before—like picturing a horse wearing jeans, for instance, or imagining what they would do if they won the lottery. This ability to handle hypotheticals, to expand our mental model space far beyond what we can experience directly, in a word, to perform abstraction and reasoning, is arguably the defining characteristic of human cognition. I call it "extreme generalization": an ability to adapt to novel, never experienced before situations, using very little data or even no new data at all.

This stands in sharp contrast with what deep nets do, which I would call "local generalization": the mapping from inputs to outputs performed by deep nets quickly stops making sense if new inputs differ even slightly from what they saw at training time. Consider, for instance, the problem of learning the appropriate launch parameters to get a rocket to land on the moon. If you were to use a deep net for this task, whether training using supervised learning or reinforcement learning, you would need to feed it with thousands or even millions of launch trials, i.e. you would need to expose it to a dense sampling of the input space, in order to learn a reliable mapping from input space to output space. By contrast, humans can use their power of abstraction to come up with physical models—rocket science—and derive an exact solution that will get the rocket on the moon in just one or a few trials. Similarly, if you developed a deep net controlling a human body, and wanted it to learn to safely navigate a city without getting hit by cars, the net would have to die many thousands of times in various situations until it could infer that cars are dangerous, and develop appropriate avoidance behaviors. Dropped into a new city, the net would have to relearn most of what it knows. On the other hand, humans are able to learn safe behaviors without having to die even once—again, thanks to their power of abstract modeling of hypothetical situations. (...)
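[ed. The contrast between learning from a dense sampling of trials and deriving an answer from a physical model can be illustrated with a much simpler launch problem. This is a hedged toy sketch, not the author's example: it assumes an ideal projectile on flat ground with no air resistance, and it uses a nearest-trial lookup as a stand-in for any model that only generalizes locally. It is not a deep net, and all function names are made up for illustration.]

```python
import math

G = 9.81  # m/s^2; assumed constant gravity, no air resistance

def simulate_range(speed, angle_deg=45.0):
    """One 'launch trial': distance travelled by an ideal projectile."""
    angle = math.radians(angle_deg)
    return speed ** 2 * math.sin(2 * angle) / G

def fit_lookup(speeds):
    """'Learn' from many trials by recording (distance, speed) pairs."""
    return sorted((simulate_range(v), v) for v in speeds)

def predict_speed_lookup(table, target):
    """Local generalization: reuse the speed of the closest recorded trial."""
    return min(table, key=lambda pair: abs(pair[0] - target))[1]

def predict_speed_physics(target, angle_deg=45.0):
    """Abstract model: invert the range equation, no trials needed."""
    angle = math.radians(angle_deg)
    return math.sqrt(target * G / math.sin(2 * angle))

def miss_distance(speed, target):
    """How far from the target the projectile lands."""
    return abs(simulate_range(speed) - target)

# Dense sampling of launch speeds between 10 and 30 m/s (201 trials).
trial_speeds = [10 + 0.1 * i for i in range(201)]
table = fit_lookup(trial_speeds)

# 50 m lies inside the range covered by the trials; 500 m lies far outside it.
for target in (50.0, 500.0):
    lookup_miss = miss_distance(predict_speed_lookup(table, target), target)
    physics_miss = miss_distance(predict_speed_physics(target), target)
    print(f"target {target:6.1f} m: lookup misses by {lookup_miss:8.2f} m, "
          f"physics misses by {physics_miss:.2e} m")
```

[ed. Inside the sampled range the lookup lands close to the target; outside it, the lookup can do no better than its most extreme recorded trial, while the formula keeps working.]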


Here's what you should remember: the only real success of deep learning so far has been the ability to map space X to space Y using a continuous geometric transform, given large amounts of human-annotated data. Doing this well is a game-changer for essentially every industry, but it is still a very long way from human-level AI.

To lift some of these limitations and start competing with human brains, we need to move away from straightforward input-to-output mappings, and on to reasoning and abstraction. A likely appropriate substrate for abstract modeling of various situations and concepts is that of computer programs. We have said before (Note: in Deep Learning with Python) that machine learning models could be defined as "learnable programs"; currently we can only learn programs that belong to a very narrow and specific subset of all possible programs. But what if we could learn any program, in a modular and reusable way? Let's see in the next post what the road ahead may look like.

You can read the second part here: The future of deep learning.

by Francois Chollet, Keras Blog | Read more:
Image: Getty

Long Strange Trip

Amir Bar-Lev’s rockumentary, Long Strange Trip, about the Grateful Dead, is aptly named for what is arguably the band’s most famous lyric: What a long, strange trip it’s been. The film takes you on a four-hour ride (much like the band's live shows) but this is not just another indulgent music doc.

Executive-produced by Martin Scorsese, the film digs deeply into the bizarre phenomenon that surrounded “The Dead” for decades—obsessive fans, called Deadheads, became a cult-like following that elevated the band’s ringmaster, Jerry Garcia (Aug. 1, 1942–Aug. 9, 1995), to a status he never wanted.

The must-see film includes 17 interviews, 1,100 rare photos and loads of footage you’ve never seen. Deadheads will be ecstatic. Bar-Lev doesn’t tell you what to think. Instead he offers many points of view. One theory is that the die-hard Deadheads were the major cause of Garcia’s descent into heroin. I didn’t buy that so I reached out to Grateful Dead insider Dennis McNally, whose book, A Long Strange Trip: The Inside History of the Grateful Dead, provided Bar-Lev with much of the band’s story. McNally spent 30 years with the band beginning when Garcia invited him to become their biographer in 1981.

I asked McNally if he thought it was the Deadheads that drove Garcia to abuse heroin, or if he felt, as I do, that it was a progression from one addiction to another. McNally answered:

“I don’t think there’s an inherent progression [of addiction], I mean everybody starts with milk, too, you know? He turned to self-medication for any number of reasons…. His father died when he was four, he didn’t get the attention from his mother that he felt he deserved. Eventually, yes, but not specifically the fame. It was the responsibility. Jerry wanted to be Huckleberry Finn, well, if Huckleberry Finn was allowed to smoke joints and play guitar and cruise down the river on a raft.”

McNally pointed out that Garcia “employed 50 people, me among them. We expected paychecks every couple of weeks. There was a weight of responsibility on him for employees, their families, but also the million Deadheads who were addicted to the music and the shows. They expected him to come out and play 80 shows a year. That wore on him. He was 53 years old and a walking candidate for a heart attack. Still smoked cigarettes, had a terrible diet. He was a diabetic who did not take it seriously.” (...)

The movie mimicked Garcia’s life. It began as a celebration but ended with a trip down the dark cellar of no return. Garcia probably didn’t set out to become a heroin addict. Maybe he just thought he could handle it. Or maybe, like my own drug use, it just got to a point where he wanted out. Many people didn’t know the flip side of his jolly exterior was a dark depression.

The Dead’s casualties also included Ron “Pigpen” McKernan, who drank himself to death in 1973 at age 27. Keith Godchaux died at age 32 in 1980 after a car crash that followed hours of partying with his friend Courtenay Pollock to celebrate Godchaux’s birthday. The intoxicated driver, Pollock, survived the accident. Brent Mydland, mostly known as a drinker, died from a “speedball” (morphine and cocaine) in 1990. After Mydland’s death, keyboardist Vince Welnick joined the band; he died by suicide in 2006. Phil Lesh’s drug use led to his contracting hepatitis C. In the fall of 1998, his life was saved by a liver transplant.

Next I tracked down former president of Warner Bros. Records, Joe Smith, the executive who first signed the Dead. His presence brought a lot of fun into the film during the celebratory first half. “They were totally insane at times,” said Smith. “Trying to corral them was a very difficult thing, but we developed a friendship and Jerry Garcia was very bright. They were all bright. I established relationships with all of them, but not without difficulty because they didn’t want relationships. They were stoned most of the time. Phil Lesh was a particularly difficult guy.”

“How so?” I asked.

“He disputed and contested everything. One time, when I was trying to promote them, he said, ‘No. Let’s go out and record 30 minutes of heavy air on a smoggy day in L.A. Then we’re gonna record 30 minutes of clear air on a beautiful day, and we’ll mix it and that’ll be a rhythm soundtrack.’”

by Dorri Olds, The Fix |  Read more:
Image: Michael Conway

Tuesday, July 18, 2017

Letting Robots Teach Schoolkids

For all the talk about whether robots will take our jobs, a new worry is emerging, namely whether we should let robots teach our kids. As the capabilities of smart software and artificial intelligence advance, parents, teachers, teachers’ unions and the children themselves will all have stakes in the outcome.

I, for one, say bring on the robots, or at least let us proceed with the experiments. You can imagine robots in schools serving as pets, peers, teachers, tutors, monitors and therapists, among other functions. They can store and communicate vast troves of knowledge, or provide a virtually inexhaustible source of interactive exchange on any topic that can be programmed into software.

But perhaps more important in the longer run, robots also bring many introverted or disabled or non-conforming children into greater classroom participation. They are less threatening, always available, and they never tire or lose patience.

Human teachers sometimes feel the need to bully or put down their students. That’s a way of maintaining classroom control, but it also harms children and discourages learning. A robot in contrast need not resort to tactics of psychological intimidation.

The pioneer in robot education so far is, not surprisingly, Singapore. The city-state has begun experiments with robotic aides at the kindergarten level, mostly to assist instructors, read stories and teach social interactions. In the U.K., researchers have developed a robot to help autistic children better learn how to interact with their peers.

I can imagine robots helping non-English-speaking children make the transition to bilingualism. Or how about using robots in Asian classrooms where the teachers themselves do not know enough English to teach the language effectively?

A big debate today is how we can teach ourselves to work with artificial intelligence, so as to prevent eventual widespread technological unemployment. Exposing children to robots early, and having them grow accustomed to human-machine interaction, is one path toward this important goal.

In a recent Financial Times interview, Sherry Turkle, a professor of social psychology at MIT, and a leading expert on cyber interactions, criticized robot education. “The robot can never be in an authentic relationship," she said. "Why should we normalize what is false and in the realm of [a] pretend relationship from the start?” She’s opposed to robot companions more generally, again for their artificiality.

Yet K-12 education itself is a highly artificial creation, from the chalk to the schoolhouses to the standardized achievement tests, not to mention the internet learning and the classroom TV. Thinking back on my own experience, I didn’t especially care if my teachers were “authentic” (in fact, I suspected quite a few were running a kind of personality con), provided they communicated their knowledge and radiated some charisma.  (...)

Keep in mind that robot instructors are going to come through toys and the commercial market in any case, whether schools approve or not. Is it so terrible an idea for some of those innovations to be supervised by, and combined with, the efforts of teachers and the educational establishment?

by Tyler Cowen, Bloomberg |  Read more:
Image: Nigel Treblin/Getty Images
[ed. See also: Give robots an 'ethical black box' to track and explain decisions]

Friday, July 14, 2017

Trump's Russian Laundromat

In 1984, a Russian émigré named David Bogatin went shopping for apartments in New York City. The 38-year-old had arrived in America seven years before, with just $3 in his pocket. But for a former pilot in the Soviet Army—his specialty had been shooting down Americans over North Vietnam—he had clearly done quite well for himself. Bogatin wasn’t hunting for a place in Brighton Beach, the Brooklyn enclave known as “Little Odessa” for its large population of immigrants from the Soviet Union. Instead, he was fixated on the glitziest apartment building on Fifth Avenue, a gaudy, 58-story edifice with gold-plated fixtures and a pink-marble atrium: Trump Tower.

A monument to celebrity and conspicuous consumption, the tower was home to the likes of Johnny Carson, Steven Spielberg, and Sophia Loren. Its brash, 38-year-old developer was something of a tabloid celebrity himself. Donald Trump was just coming into his own as a serious player in Manhattan real estate, and Trump Tower was the crown jewel of his growing empire. From the day it opened, the building was a hit—all but a few dozen of its 263 units had sold in the first few months. But Bogatin wasn’t deterred by the limited availability or the sky-high prices. The Russian plunked down $6 million to buy not one or two, but five luxury condos. The big check apparently caught the attention of the owner. According to Wayne Barrett, who investigated the deal for the Village Voice, Trump personally attended the closing, along with Bogatin.

If the transaction seemed suspicious—multiple apartments for a single buyer who appeared to have no legitimate way to put his hands on that much money—there may have been a reason. At the time, Russian mobsters were beginning to invest in high-end real estate, which offered an ideal vehicle to launder money from their criminal enterprises. “During the ’80s and ’90s, we in the U.S. government repeatedly saw a pattern by which criminals would use condos and high-rises to launder money,” says Jonathan Winer, a deputy assistant secretary of state for international law enforcement in the Clinton administration. “It didn’t matter that you paid too much, because the real estate values would rise, and it was a way of turning dirty money into clean money. It was done very systematically, and it explained why there are so many high-rises where the units were sold but no one is living in them.” When Trump Tower was built, as David Cay Johnston reports in The Making of Donald Trump, it was only the second high-rise in New York that accepted anonymous buyers.

In 1987, just three years after he attended the closing with Trump, Bogatin pleaded guilty to taking part in a massive gasoline-bootlegging scheme with Russian mobsters. After he fled the country, the government seized his five condos at Trump Tower, saying that he had purchased them to “launder money, to shelter and hide assets.” A Senate investigation into organized crime later revealed that Bogatin was a leading figure in the Russian mob in New York. His family ties, in fact, led straight to the top: His brother ran a $150 million stock scam with none other than Semion Mogilevich, whom the FBI considers the “boss of bosses” of the Russian mafia. At the time, Mogilevich—feared even by his fellow gangsters as “the most powerful mobster in the world”—was expanding his multibillion-dollar international criminal syndicate into America.

In 1987, on his first trip to Russia, Trump visited the Winter Palace with Ivana. The Soviets flew him to Moscow—all expenses paid—to discuss building a luxury hotel across from the Kremlin.

Since Trump’s election as president, his ties to Russia have become the focus of intense scrutiny, most of which has centered on whether his inner circle colluded with Russia to subvert the U.S. election. A growing chorus in Congress is also asking pointed questions about how the president built his business empire. Rep. Adam Schiff, the ranking Democrat on the House Intelligence Committee, has called for a deeper inquiry into “Russian investment in Trump’s businesses and properties.”

The very nature of Trump’s businesses—all of which are privately held, with few reporting requirements—makes it difficult to root out the truth about his financial deals. And the world of Russian oligarchs and organized crime, by design, is shadowy and labyrinthine. For the past three decades, state and federal investigators, as well as some of America’s best investigative journalists, have sifted through mountains of real estate records, tax filings, civil lawsuits, criminal cases, and FBI and Interpol reports, unearthing ties between Trump and Russian mobsters like Mogilevich. To date, no one has documented that Trump was even aware of any suspicious entanglements in his far-flung businesses, let alone that he was directly compromised by the Russian mafia or the corrupt oligarchs who are closely allied with the Kremlin. So far, when it comes to Trump’s ties to Russia, there is no smoking gun.

But even without an investigation by Congress or a special prosecutor, there is much we already know about the president’s debt to Russia. A review of the public record reveals a clear and disturbing pattern: Trump owes much of his business success, and by extension his presidency, to a flow of highly suspicious money from Russia. Over the past three decades, at least 13 people with known or alleged links to Russian mobsters or oligarchs have owned, lived in, and even run criminal activities out of Trump Tower and other Trump properties. Many used his apartments and casinos to launder untold millions in dirty money. Some ran a worldwide high-stakes gambling ring out of Trump Tower—in a unit directly below one owned by Trump. Others provided Trump with lucrative branding deals that required no investment on his part. Taken together, the flow of money from Russia provided Trump with a crucial infusion of financing that helped rescue his empire from ruin, burnish his image, and launch his career in television and politics. “They saved his bacon,” says Kenneth McCallion, a former assistant U.S. attorney in the Reagan administration who investigated ties between organized crime and Trump’s developments in the 1980s.

It’s entirely possible that Trump was never more than a convenient patsy for Russian oligarchs and mobsters, with his casinos and condos providing easy pass-throughs for their illicit riches. At the very least, with his constant need for new infusions of cash and his well-documented troubles with creditors, Trump made an easy “mark” for anyone looking to launder money. But whatever his knowledge about the source of his wealth, the public record makes clear that Trump built his business empire in no small part with a lot of dirty money from a lot of dirty Russians—including the dirtiest and most feared of them all.

by Craig Unger, New Republic |  Read more:
Image: Alex Nabaum

Thursday, July 13, 2017