A. Jesse Jiryu Davis

Street Retreat Recap


A couple weeks ago I went on my third Zen street retreat, led by Roshi Grover Genro Gauntt. Here are my notes on what happened.

Thursday

We met in Washington Square Park in the afternoon to start the retreat. We had no wallets, cellphones, toothbrushes, books. There were eleven of us. We sat in a circle and meditated. A little jazz combo was playing nearby. We introduced ourselves to each other. Most were on their first retreat, except for Genro, Eileen, me, and Batman. That's his nickname because he was an extra in a Batman movie once. He's a garbage man. He's a long-time student of Bernie Glassman and he's been homeless thirty years. He just got into an apartment this year.

Batman

Dinner at the Bowery Mission. We sat through a service first. Indifferent and nearly inaudible Christian guitar rock. We were supposed to sing along, but the lyrics were on a little screen at the front that no one could read, even if we cared to. Much of the congregation was asleep or reading newspapers.

They served beef stroganoff, steamed spinach, nectarines. Fagé yogurt containers were stacked on a table. I stuck with the spinach and a nectarine.

Outside were demolition dumpsters, related to a new New Museum expansion. They were filled with carpet remnants, cut into manageable sizes and conveniently rolled up. All eleven of us chose a roll each and attached them to our bags if possible, or carried them with us. We'd keep them for the rest of the retreat.

We walked to Masjid Farah, the sufi mosque on West Broadway. In past retreats we've wandered looking for it—no cellphones, after all—but Batman knows where everything is and he led us straight there. "Do you want to go the scenic way or the short way?" he asked each time I made a wrong turn.

The streets in the West Village were set up for a big film shoot. Ari and I noticed that we passed right by Fred Armisen standing on a corner talking to a member of the film crew.

The zikr ceremony at the mosque is always a trip. They sing and shout and chant Allah's name until it sounds like gasping. Unlike in past years, we stayed only a couple hours. When everyone got up to dance, we left.

Our plan was to find a place in Battery Park to sleep. We walked with our carpets down to the park. Along the Canyon of Heroes, Jonathan read the name and date of every celebrity ever given a ticker-tape parade in NYC.

Walking

We found the park almost entirely fenced off for construction. Weirdly, there was a wide opening in the fence, from which led a fenced corridor to a round, fenced area in the middle of the construction zone. We debated a little: was this illegal? Would we be rousted? Signs on the fence warned of rats and rat poison: were they any concern to us? We bedded down and slept.

Or tried to. I was too cold. The wind was up for much of the night and we were unsheltered. Mike had a T-shirt and his carpet and froze all night. Batman had come with a huge backpack and produced from it a yoga mat and a fancy sleeping bag; he slept well. I had my usual leather jacket and hoodie, and I'd packed a new wool army blanket that I bought for the retreat. I got to sleep a few hours despite the cold. The worst for me was that my blanket shed wool-lint into my eyes, while the carpet from the demolition debris puffed plaster dust up my nose every time I shifted. I was glad to finally hear the birds start to chirp, and to pull the blanket from my face and see the sky brightening.

(Note: Now that I've washed this blanket once, it's not only lint-free but several times softer. Lesson learned.)

Friday

We walked up to McAuley's (New York Rescue Mission) at 6 or so in the morning. Breakfast was a revolting little stack of ham slices and a few pancakes, I think, glued with corn syrup to the bottom of a styrofoam bowl. Strong coffee, though.

We walked to Washington Square Park. Along the way, Batman saw a food cart vendor he knew. He banged on the window and talked to the man a moment, then announced to us: "Anybody want coffee? Tell him how you like it! Step up and get some!" Batman is old friends with everyone.

In Washington Square Park, some sections were soaked, or were being soaked with big sprinklers. We found a dry spot and bedded down for a nap. Still cold, even with the sun up. A parks person came up in a buggy and called out, "Parks! What's going on here?"

Genro: "We're resting."

Parks: "How long are you going to be here?"

Genro: "Maybe an hour."

Parks: "Well okay, we can hold off watering this section for a bit. But just to let you know: water is coming."

Surprisingly accommodating.

Roshi Genro, Washington Square Park

Washington Square Park

Sam, Washington Square Park

Brunch at The Catholic Worker in the East Village, the usual homemade bread and vegetable soup, and good coffee. This is always my favorite meal on retreat. A bottle of hot sauce was on each table: actual flavor!

We walked up to Tompkins Square Park to nap, do meditation and service, and council.

Roshi Genro, Tompkins Square Park

Batman

Dinner at the Bowery Mission again. Friday nights they have a pickup jazz band; I've heard it once or twice before, but Friday night they were great. Some big-band flair and wild solos, playing their hearts out, shaking the walls. The drummer, Uros Markovic, has led this band for years. He alternates jazz pieces with bits of preaching. The rest of the band was a sax, a bass trombone, piano, and a woman on the tambourine. I forget her name, but we'd spent time with her on the last retreat, and Batman knows her very well. She seems to have cleaned up, calmed down, put on weight, and become a much better musician since the last time we saw her.

Markovic gave a sermon in which he criticized TV programs that purport to tell us about the real Jesus, that show him as an ordinary man. Jesus wasn't an ordinary man, after all, he was special. We know because the Bible tells us. Muslims get Jesus wrong, too: they don't believe he rose from the dead after three days. Those who know Jesus will live forever in Heaven. Everyone else goes to Hell.

The band did a jazz version of "Nothing but the Blood" by Robert Lowry, which is incredibly weird:

Glory! Glory! This I sing—
Nothing but the blood of Jesus,
All my praise for this I bring—
Nothing but the blood of Jesus.

Dinner was pasta, and some slices of meat that the servers were calling chicken but looked like ham. As is my habit on street retreat, I refused the non-vegetables, so I just had salad. "You want a double portion then? You like avocado? Here, let me put some dressing on. Is this enough? Tell me when! You want some grated cheese?" I love this kind of bustling generosity you find sometimes.

We stood outside the Mission, and a fight broke out between two men, a small black guy and one about my size. The crowd stood back to make room for them. A woman was yelling, egging them on as they assumed boxers' stances and shifted around the sidewalk. Across the street, some men at a bus stop cheered. I thought of intervening but doubted I was the right person for the job. The smaller man threw the first punch, knocked the other down, and raised his foot to stomp on his opponent's head. The man on the ground deflected the stomp with his hands. Batman and a few others rushed in to end the fight. Batman grabbed the winner and pushed him back, arguing with him to leave it be, it's over. He and Batman have been friends for years.

Where should we sleep? Returning to Battery Park would be a long walk in exchange for an uncomfortable night. The wind was up again, and the temperature down.

Batman had two recommendations. The first, an entryway to the Hebrew Union College at West 4th and Mercer. The police treat it like holy ground, Batman said, a refuge: if we stayed just off the sidewalk, in front of the door to the college, we'd be allowed to sleep in plain sight. The alternative was a park in an apartment complex a few more blocks to the south. We found there a big bench-like thing in the middle of the park, a spiral of concrete. It was perfect for 11 people. We walked into the spiral and put down our bedding, cozying up to the shelter of the bench. Residents were passing, walking their dogs, talking on their phones, glancing at us. I wondered what they thought, and if one of them would call the cops on us. And yet, we were out of the wind, and I fell asleep quickly.

Perhaps half an hour later, I heard someone slapping on the concrete. "Hello. Security." I pulled my blanket from my face. A pretty young black woman in a blue nylon jacket was standing outside the spiral. She said that she'd been called with complaints from residents about people sleeping in the park and we needed to move out. If we didn't, a lot more security people would come and it would be a big mess. She talked gently. "I need to stand here until you go." I asked if she knew anywhere else we could sleep. "I'm sorry, I'm not from around here."

We rolled up our carpets and walked a couple blocks to the Hebrew Union College entryway. It's a triangle, a few yards on a side, off the Mercer Street sidewalk. It was enough room for all of us if we slept side by side. Nearby was a small and very warm heating vent, continuously blowing out enough air to warm a half-dozen people standing around it. Batman says that vent is the tribal campfire. He has years of stories of things that have happened around that vent. We warmed up at the vent, then bedded down for the night. It was windier than in the spiral bench, and noisier and brighter, but Batman was right: no one bothered us all night long.

Hebrew Union College entryway

Batman rolling up his yoga mat

Saturday

We got to the University Community Soup Kitchen, a.k.a. Diane's, a.k.a. the Meatloaf Kitchen, for breakfast at 10am. We sat in rows of plastic chairs and ate open-faced peanut butter and jelly sandwiches on thick slices of white bread, with milky sweet coffee in styrofoam cups. (Homeless services must be the last great consumer of styrofoam in America.) Then we sat on the chairs and waited: lunch wasn't served until 1pm. The regulars who knew each other passed the time talking. Two young white men had a conversation about travel that lasted for hours. Where do you eat in São Paulo? Where do you stay in Bangkok? Once you've bought a ticket to France, how far can you stretch the money you have left once you get there?

An hour into the wait, someone dropped a stack of New York Posts and a couple copies of the New York Times on the table at the front and there was a rush for them. As people finished the papers, they swapped. Eileen talked to someone who'd gotten a Times, and negotiated twenty minutes to read it. We learned Israel had made an airstrike in Syria. Genro and I talked the situation over with a beautiful, chic young black man sitting next to us. As the conversation moved on he told us about the Bowery Residents Committee shelter, on 25th Street. Apparently it's a nice modern-looking place with relaxed rules. He stayed there a few months before "losing his bed," whatever that means. He told a story about sharing a room with a heroin addict and a giant ex-con. The addict came home confused one night. He leaned slowly over the sleeping ex-con, farther and farther, and pissed on him. The ex-con woke up, felt the piss on his face, and beat down the addict. Good times.

A guy I recognized from my last retreat was still there, a big black man with curly hair and a beard. He'd be played by Forest Whitaker. Last time, he made the lunch wait hard by hunching in his chair and screaming rhythmically all morning. "Give it a rest!" people shouted at him. This time his tic was kind of fun: he'd periodically walk up to someone, touch his throat and sing a few jazzy syllables, nod gravely, and walk away. It was a way to say "hello" to the newcomers: over the course of the morning he seemed particularly to connect with members of our group this way.

Around 1pm the famous meal was served: meatloaf, meatballs in tomato sauce, salad, rolls, coffee. We had real table service. Volunteers walked around refilling our coffee and offering seconds. I make no claim to connoisseurship, but the meatloaf was pretty good.

I was tired of being cold at night, and regretting that I hadn't asked for a coat at the Mission the day before. I asked a volunteer if there was a coat I could have. She was a white woman named Fatima, dressed in an abaya, with a headscarf that covered her piled dreadlocks. On the back of the abaya she had sewn a black Occupy Wall Street patch that covered the width of her back. Her hood left visible a few inches of forehead above her eyes, which was tattooed with green lines in a North African style. She asked whether I wanted a blanket or a coat. What kind of coat would I prefer? I just said, "Whatever's warmest."

Fatima was in the donations room for ten minutes trying to find me the perfect coat. She came out with a green nylon pullover. "Look," she said, "this has a fleece lining, and deep pockets you can put stuff in." She held up the coat and turned it so I could see the lining and the pockets. "You can wear it zipped, or unzipped, and you can tie it around your waist during the day." She demonstrated tying it around her waist. "This should keep you warm tonight." She held up her hand for a high-five.

She gave me the schedule for some homeless advocacy group and insisted I show up. She seemed very concerned about me, and perhaps also she saw in me a possible advocate for the homeless. I was uncomfortable. This had gone on long past the point where I should have explained about the retreat, but it seemed too late to suddenly come clean. This was the only time on the retreat that I misjudged the balance between being honest and blending in.

I rejoined the group with my fancy pullover and we walked up to Tompkins Square Park for naps and council.

Ari, Tompkins Square Park

Roshi Genro, Tompkins Square Park

Dylan, Tompkins Square Park

Batman, Tompkins Square Park

Dinner at the Bowery Mission again: this time no music or preaching, and the kitchen moved sluggishly. We are professionals at waiting by now. We sat in the pews for an hour talking to the other guests or listening to their conversations. A man behind me asked his friend to take a look at his eye. "Oh, you definitely have pinkeye. Don't do that! That. Don't rub it, you'll make it spread."

Their conversation moved on to the death of a rapper from Kris Kross. "He was doing coke and heroin together. What do they call that?" "Speedball." "Do you snort it or shoot it?" "Either way, man. Shooting it's dangerous, though."

I forget what was the main course for dinner. I had sickly-sweet iced tea, salad, and an apricot.

We slept in the entryway to Hebrew Union College again. This final night was the coldest yet. I was much better equipped than at the start of the retreat, with my new fleece from Fatima, on top of my leather jacket, and three layers beneath that, but I was still shivering too hard to sleep more than a few hours. But even so: I'd lost any fear of being cold. I was prepared to lie down and shiver until the morning, so that's what I did.

Sunday

I woke up from the cold. I'd learned how the bird songs mark the stages of dawn. When I heard the kind of bird that starts calling around 6, I stopped trying to sleep and got up. There was a party going on around the heating vent: Others had been up since 4. Batman had coffee for us from one of his contacts somewhere—he was acting mysterious about where it came from.

The others had received a box with more cups of coffee, plus cupcakes and a whole cucumber. Some street fair had ended in the late night or early morning, and the organizers had gone around in a van, looking to give their leftovers to some homeless people. Who they found was us.

We stood around the vent talking until 10, drinking coffee, smoking butts we found on the street. What a lovely way to spend a morning.

Mercer Street

Kristi and Marco

Sam

Dylan and Genro

Eileen

We ditched our carpets, with gratitude for their service to us, and gratitude that we needn't lug them any farther. We skipped breakfast and had a long and emotional closing council in Washington Square Park.

Wasp's Nest: The Read-Copy-Update Pattern In Python


In recent work on PyMongo, I used a concurrency-control pattern that solves a variety of reader-writer problem without mutexes. It's similar to the read-copy-update technique used extensively in the Linux kernel. I'm dubbing it the Wasp's Nest. Stick with me—by the end of this post you'll know a neat concurrency pattern, and have a good understanding of how PyMongo handles replica set failovers.

Update: In this post's first version I didn't know how close my code is to "read-copy-update". Robert Moore schooled me in the comments. I also named it "a lock-free concurrency pattern" and Steve Baptiste pointed out that I was using the term wrong. My algorithm merely solves a race condition without adding a mutex; it's not lock-free. I love this about blogging: in exchange for a little humility I get a serious education.


Paper Wasp © MzePhotos.com, Some Rights Reserved

The Mission

MongoDB is deployed in "replica sets" of identical database servers. A replica set has one primary server and several read-only secondary servers. Over time a replica set's state can change. For example, if the primary's cooling fans fail and it bursts into flames, a secondary takes over as primary a few seconds later. Or a sysadmin can add another server to the set, and once it's synced up it becomes a new secondary.

I help maintain PyMongo, the Python driver for MongoDB. Its MongoReplicaSetClient is charged with connecting to the members of a set and knowing when the set changes state. Replica sets and PyMongo must avoid any single points of failure in the face of unreliable servers and networks—we must never assume any particular members of the set are available.

Consider this very simplified sketch of a MongoReplicaSetClient:

class Member(object):
    """Represents one server in the set."""
    def __init__(self, pool):
        # The connection pool.
        self.pool = pool

class MongoReplicaSetClient(object):
    def __init__(self, seeds):
        # Remember the seed list; refresh() falls back to it.
        self.seeds = seeds
        self.primary = None
        self.members = {}
        self.refresh()

        # The monitor calls refresh() every 30 sec.
        self.monitor = MonitorThread(self)

    def refresh(self):
        # If we're already connected, use our list of known
        # members. Otherwise use the passed-in list of
        # possible members, the 'seeds'.
        seeds = self.members.keys() or self.seeds

        # Try seeds until first success.
        ismaster_response = None
        for seed in seeds:
            try:
                # The 'ismaster' command gets info
                # about the whole set.
                ismaster_response = call_ismaster(seed)
                break
            except socket.error:
                # Host down / unresolvable, try the next.
                pass

        if not ismaster_response:
            raise ConnectionFailure()

        # Now we can discover the whole replica set.
        for host in ismaster_response['hosts']:
            pool = ConnectionPool(host)
            member = Member(pool)
            self.members[host] = member

        # Remove down members from dict. Iterate over a copy
        # of the keys so popping is safe.
        for host in list(self.members.keys()):
            if host not in ismaster_response['hosts']:
                self.members.pop(host)

        self.primary = ismaster_response.get('primary')

    def send_message(self, message):
        # Send an 'insert', 'update', or 'delete'
        # message to the primary.
        if not self.primary:
            self.refresh()

        member = self.members[self.primary]
        pool = member.pool
        try:
            send_message_with_pool(message, pool)
        except socket.error:
            self.primary = None
            raise AutoReconnect()

We don't know which members will be available when our application starts, so we pass a "seed list" of hostnames to the MongoReplicaSetClient. In refresh, the client tries them all until it can connect to one and run the isMaster command, which returns information about all the members in the replica set. The client then makes a connection-pool for each member and records which one is the primary.

Once refresh finishes, the client starts a MonitorThread which calls refresh again every 30 seconds. This ensures that if we add a secondary to the set it will be discovered soon and participate in load-balancing. If a secondary goes down, refresh removes it from self.members. In send_message, if we discover the primary's down, we raise an error and clear self.primary so we'll call refresh the next time send_message runs.

The Bugs

PyMongo 2.1 through 2.5 had two classes of concurrency bugs: race conditions and thundering herds.

The race condition is easy to see. Look at the expression self.members[self.primary] in send_message. If the monitor thread runs refresh and pops a member from self.members while an application thread is executing the dictionary lookup, the latter could get a KeyError. Indeed, that is exactly the bug report we received that prompted my whole investigation and this blog post.

The other bug causes a big waste of effort. Let's say the primary server bursts into flames. The client gets a socket error and clears self.primary. Then a bunch of application threads all call send_message at once. They all find that self.primary is None, and all call refresh. This is a duplication of work that only one thread need do. Depending on how many processes and threads we have, it has the potential to create a connection storm in our replica set as a bunch of heavily-loaded applications lurch to the new primary. It also compounds the race condition because many threads are all modifying the shared state. I'm calling this duplicated work a thundering herd problem, although the official definition of thundering herd is a bit different.

Fixing With A Mutex

We know how to fix race conditions: let's add a mutex! We could lock around the whole body of refresh, and lock around the expression self.members[self.primary] in send_message. No thread sees members and primary in a half-updated state.
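Here's a minimal sketch of what that mutex version might look like. LockedClient, get_primary_member, and the string "pools" are hypothetical stand-ins, not PyMongo code, and I'm using the with statement for brevity even though it isn't available in the oldest Pythons PyMongo supports:

```python
import threading


class LockedClient(object):
    """Hypothetical stand-in for MongoReplicaSetClient, guarded by one mutex."""

    def __init__(self, members, primary):
        self.lock = threading.Lock()
        self.members = dict(members)
        self.primary = primary

    def refresh(self, hosts, primary):
        # Hold the lock for the whole update so no reader sees a
        # half-updated members/primary pair.
        with self.lock:
            self.members = dict(
                (host, 'pool for %s' % host) for host in hosts)
            self.primary = primary

    def get_primary_member(self):
        # Lock only around the two-step lookup that used to race.
        with self.lock:
            return self.members[self.primary]


client = LockedClient({'a:27017': 'pool for a:27017'}, 'a:27017')
client.refresh(['b:27017', 'c:27017'], 'b:27017')
member = client.get_primary_member()
```

Every reader has to take the lock, even though refresh runs rarely.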

...and why it's not ideal

This solution has two problems. The first is minor: the slight cost of acquiring and releasing a lock for every message sent to MongoDB, especially since it means only one thread can run that section of send_message at a time. A reader-writer lock alleviates the contention by allowing many threads to run send_message as long as no thread is running refresh, in exchange for greater complexity and cost for the single-threaded case.

The worse problem is the behavior such a mutex would cause in a very heavily multithreaded application. While one thread is running refresh, all threads running send_message will queue on the mutex. If the load is heavy enough our application could fail while waiting for refresh, or could overwhelm MongoDB once they're all simultaneously unblocked. Better under most circumstances for send_message to fail fast, saying "I don't know who the primary is, and I'm not going to wait for refresh to tell me." Failing fast raises more errors but keeps the queues small.

The Wasp's Nest Pattern

There's a better way, one that requires no locks, is less error-prone, and fixes the thundering-herd problem too. Here's what I did for PyMongo 2.5.1, which we'll release next week.

First, all information about the replica set's state is pulled out of MongoReplicaSetClient and put into an RSState object:

class RSState(object):
    def __init__(self, members, primary):
        self.members = members
        self.primary = primary

MongoReplicaSetClient gets one RSState instance that it puts in self.rsstate. This instance is immutable: no thread is allowed to change the contents, only to make a modified copy. So if the primary goes down, refresh doesn't just set primary to None and pop its hostname from the members dict. Instead, it makes a deep copy of the RSState, and updates the copy. Finally, it replaces the old self.rsstate with the new one.

Each of the RSState's attributes must be immutable and cloneable, too, which requires a very different mindset. For example, I'd been tracking each member's ping time using a 5-sample moving average and updating it with a new sample like so:

class Member(object):
    def add_sample(self, ping_time):
        self.samples = self.samples[-4:]
        self.samples.append(ping_time)
        self.avg_ping = sum(self.samples) / len(self.samples)

But if Member is immutable, then adding a sample means cloning the whole Member and updating it. Like this:

class Member(object):
    def clone_with_sample(self, ping_time):
        # Make a new copy of 'samples'
        samples = self.samples[-4:] + [ping_time]
        return Member(samples)
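To make that concrete, here's a self-contained sketch of an immutable-by-convention Member. The constructor signature and the tuple storage are my assumptions for illustration; the real class also carries a connection pool and more, and this toy version assumes at least one sample:

```python
class Member(object):
    """Immutable-by-convention record of one server's ping samples."""

    def __init__(self, samples):
        # Store a tuple so the samples can't be mutated in place.
        self.samples = tuple(samples)
        self.avg_ping = sum(self.samples) / float(len(self.samples))

    def clone_with_sample(self, ping_time):
        # Keep the 4 newest samples, add the new one, build a new Member.
        return Member(self.samples[-4:] + (ping_time,))


old = Member([10, 20, 30, 40, 50])
new = old.clone_with_sample(60)
```

The old Member is untouched, so any thread still holding a reference to it keeps seeing a consistent set of samples and average.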

Any method that needs to access self.rsstate more than once must protect itself against the state being replaced concurrently. It has to make a local copy of the reference. So the racy expression in send_message becomes:

rsstate = self.rsstate  # Copy reference.
member = rsstate.members[rsstate.primary]

Since the rsstate cannot be modified by another thread, send_message knows its local reference to the state is safe to read.
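Putting the pieces together, here's a toy demonstration of copy, update, and swap. Client and primary_down are hypothetical names, the "pools" are just strings, and I'm assuming (as CPython's GIL lets us) that rebinding self.rsstate is a single atomic step:

```python
import copy


class RSState(object):
    def __init__(self, members, primary):
        self.members = members
        self.primary = primary


class Client(object):
    """Hypothetical, stripped-down client holding one RSState."""

    def __init__(self, rsstate):
        self.rsstate = rsstate

    def primary_down(self):
        # Copy, modify the copy, then swap it in with one assignment.
        new_state = copy.deepcopy(self.rsstate)
        new_state.members.pop(new_state.primary, None)
        new_state.primary = None
        self.rsstate = new_state


client = Client(
    RSState({'a:27017': 'pool-a', 'b:27017': 'pool-b'}, 'a:27017'))

# A reader takes its local reference before the swap...
rsstate = client.rsstate
client.primary_down()

# ...and still sees a consistent snapshot afterwards.
member = rsstate.members[rsstate.primary]
```

The reader's snapshot is stale but never half-updated: the lookup can't raise KeyError because nobody ever modifies the old RSState.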

A few summers ago I was on a Zen retreat in a rural house. We had paper wasps building nests under the eaves. The wasps make their paper from a combination of chewed-up plant fiber and saliva. The nest hangs from a single skinny petiole. It's precarious, but it seems to protect the nest from ants who want to crawl in and eat the larvae. The queen periodically spreads an ant-repellant secretion around the petiole; its slenderness conserves her ant-repellant, and concentrates it in a small area.

Wasp's Nest in Situ [Source]

I think of the RSState like a wasp's nest: it's an intricate structure hanging off the MongoReplicaSetClient by a single attribute, self.rsstate. The slenderness of the connection protects send_message from race conditions, just as the thin petiole protects the nest from ants.

Since I was fixing the race condition I fixed the thundering herd as well. Only one thread should run refresh after a primary goes down, and all other threads should benefit from its labor. I nominated the monitor to be that one thread:

class MonitorThread(threading.Thread):
    def __init__(self, client):
        threading.Thread.__init__(self)
        self.client = weakref.proxy(client)
        self.event = threading.Event()
        self.refreshed = threading.Event()

    def schedule_refresh(self):
        """Refresh immediately."""
        self.refreshed.clear()
        self.event.set()

    def wait_for_refresh(self, timeout_seconds):
        """Block until refresh completes."""
        self.refreshed.wait(timeout_seconds)

    def run(self):
        while True:
            self.event.wait(timeout=30)
            self.event.clear()

            try:
                try:
                    self.client.refresh()
                finally:
                    self.refreshed.set()
            except AutoReconnect:
                pass
            except:
                # Client was garbage-collected.
                break

(The weakref proxy prevents a reference cycle and lets the thread die when the client is deleted. The weird try-finally syntax is necessary in Python 2.4.)

The monitor normally wakes every 30 seconds to notice changes in the set, like a new secondary being added. If send_message discovers that the primary is gone, it wakes the monitor early by signaling the event it's waiting on:

rsstate = self.rsstate
if not rsstate.primary:
    self.monitor.schedule_refresh()
    raise AutoReconnect()

No matter how many threads call schedule_refresh, the work is only done once.
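Here's a small sketch of that coalescing. Monitor is a hypothetical cut-down MonitorThread that counts refreshes instead of talking to a client, and I start it only after the callers finish so the outcome is deterministic:

```python
import threading


class Monitor(threading.Thread):
    """Hypothetical monitor that counts refreshes instead of doing them."""

    def __init__(self):
        threading.Thread.__init__(self)
        self.daemon = True
        self.event = threading.Event()
        self.refreshed = threading.Event()
        self.refresh_count = 0

    def schedule_refresh(self):
        self.refreshed.clear()
        self.event.set()

    def run(self):
        while True:
            self.event.wait(timeout=30)
            self.event.clear()
            self.refresh_count += 1  # Stand-in for client.refresh().
            self.refreshed.set()


monitor = Monitor()

# Ten "application threads" all ask for a refresh before the monitor wakes.
callers = [threading.Thread(target=monitor.schedule_refresh)
           for _ in range(10)]
for t in callers: t.start()
for t in callers: t.join()

monitor.start()
refreshed_in_time = monitor.refreshed.wait(5)
refresh_count = monitor.refresh_count
```

Setting an already-set Event is a no-op, so the ten requests collapse into one wakeup. A request that arrives while a refresh is already running would trigger one more refresh afterwards, which is what we want.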

Any MongoReplicaSetClient method that needs to block on refresh can wait for the "refreshed" event:

rsstate = self.rsstate
if not rsstate.primary:
    self.monitor.schedule_refresh()
    self.monitor.wait_for_refresh(timeout_seconds=5)

# Get the new state.
rsstate = self.rsstate
if not rsstate.primary:
    raise AutoReconnect()

# Proceed normally....

This pattern mitigates the connection storm from a heavily-loaded application discovering that the primary has changed: only the monitor thread goes looking for the new primary. The others can abort or wait.

The wasp's nest pattern is a simple and high-performance solution to some varieties of reader-writer problem. Compared to mutexes it's easy to understand, and most importantly it's easy to program correctly. For further reading see my notes in the source code.

Paper wasp and nest [Source]

Another Thing About Python's Threadlocals


Dammit

As the maintainer of the connection pool for PyMongo, the official MongoDB driver for Python, I've gotten far more intimate knowledge of Python threads than I'd ever wanted.

One of the challenges I face is: if the connection pool assigns a socket to a thread and the thread dies, how do we reclaim the socket for the general pool? I thought I nailed it last year, using a weakref callback to a threadlocal, but there's a bug in that method. Justin Patrin of Idle Games discovered it while testing a PyMongo contribution he's making. I'm going to describe the bug, its impact, the cause, and the fix. I'll conclude by kvetching about supporting archaic versions of Python.

The Bug

Here's some code to start 1000 threads and register to be notified when they're kaput. At the end I assert no thread has died unmourned:

import threading
import weakref

nthreads = 1000
ncallbacks = 0
ncallbacks_lock = threading.Lock()
local = threading.local()
refs = set()

class Vigil(object):
    pass

def run():
    def on_thread_died(ref):
        global ncallbacks
        ncallbacks_lock.acquire()
        ncallbacks += 1
        ncallbacks_lock.release()

    vigil = Vigil()
    local.vigil = vigil
    refs.add(weakref.ref(vigil, on_thread_died))

threads = [threading.Thread(target=run)
           for _ in range(nthreads)]
for t in threads: t.start()
for t in threads: t.join()
getattr(local, 'c', None)  # Trigger cleanup in <= 2.7.0
assert ncallbacks == nthreads, \
    'only %d callbacks run' % ncallbacks

This is the method I presented in "Knowing When A Python Thread Has Died". Each thread creates a "vigil" object and sticks it in a threadlocal. Since only the threadlocal refers to the vigil, the vigil should be destroyed when the thread dies. I make a weakref to the vigil and register a weakref callback. If all goes well, the callback is run as the thread dies. A quirk of Python 2.7.0 and older is that the callback is run when the next thread accesses the threadlocal. This oddity is a consequence of Python Issue 1868, fixed by Antoine Pitrou in late 2010 and released in Python 2.7.1.

Note also that I synchronize ncallbacks += 1 with a mutex, since += is not atomic in Python. This innocent-looking mutex harbors a dark intent, as we shall soon discover.
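If you want to see for yourself why += needs the lock, the dis module shows that it compiles to separate load, add, and store instructions. This sketch uses dis.get_instructions, which needs Python 3.4 or newer (so not the Pythons this post is about); the exact opcode names in between vary by version, but the load and the store are always distinct steps:

```python
import dis

counter = 0


def increment():
    global counter
    counter += 1


# Collect the opcode names the compiler produced for increment().
opnames = [ins.opname for ins in dis.get_instructions(increment)]
load_index = opnames.index('LOAD_GLOBAL')
store_index = opnames.index('STORE_GLOBAL')
```

A thread switch between the load and the store lets two threads read the same old value, and one increment is lost.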

In Python 2.7.1 and newer, the code above works as expected: ncallbacks is equal to 1000 immediately after all the threads are joined. In Python 2.7.0, ncallbacks should be 999 after the threads are joined, and then 1000 after the main thread does the final getattr to trigger cleanup.

The bug is: In Python 2.7.0 and older, ncallbacks is sometimes a few callbacks shy of a thousand. A few threads have been buried in unmarked graves....

Its Impact

I found that an application running Python 2.7.0 or older will occasionally leave a socket tied to a dead thread if it creates and destroys very large numbers of threads continuously for a long time, and if each thread calls end_request at least once and start_request more times than end_request. The number of these orphaned sockets will eventually exceed the process's ulimit or MongoDB's.

This application pattern is as weird and unusual as it sounds, which is why no one's complained of the bug.

The Fix

Once I'd written the test code above, I spent a few hours futzing with it—Dammit, I thought this worked! I tried various techniques to force Python 2.7.0 to run the callback a thousand times reliably. Late in the day a divine voice intoned, "synchronize assignment to the threadlocal." So I added a lock:

local_lock = threading.Lock()
# ...
    vigil = Vigil()
    local_lock.acquire()
    local.vigil = vigil
    local_lock.release()
    refs.add(weakref.ref(vigil, on_thread_died))

It worked! Now I was angrier. How can assigning to a threadlocal not be thread-safe?
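If a program assigns to the threadlocal in more than one place, the lock is easy to forget. Here is a minimal sketch of a helper that wraps the assignment. The name set_local is hypothetical (it is not part of PyMongo), and the try/finally is an addition so the lock is released even if the assignment raises:

```python
import threading

local = threading.local()
local_lock = threading.Lock()

def set_local(name, value):
    """Assign to the threadlocal while holding local_lock (hypothetical helper)."""
    local_lock.acquire()
    try:
        setattr(local, name, value)
    finally:
        # Release even if setattr raises, so other threads aren't blocked.
        local_lock.release()

seen = {}

def run(i):
    set_local('vigil', i)
    seen[i] = local.vigil  # each thread reads back its own value

threads = [threading.Thread(target=run, args=(i,)) for i in range(10)]
for t in threads: t.start()
for t in threads: t.join()
```

The acquire/release pair is used instead of a with statement only to stay close to the style of the listing above.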

The Cause

Let's again consider the example code above. The bytecode for assigning vigil to local.vigil is:

28 LOAD_FAST        1 (vigil)
31 LOAD_GLOBAL      3 (local)
34 STORE_ATTR       4 (vigil)

STORE_ATTR calls PyObject_SetAttr, which calls local_setattro, defined in Modules/threadmodule.c:

static int
local_setattro(localobject *self, PyObject *name, PyObject *v)
{
    PyObject *ldict;

    ldict = _ldict(self);
    if (ldict == NULL)
        return -1;

    return PyObject_GenericSetAttr((PyObject *)self, name, v);
}

The first thing it does is call _ldict. The _ldict function is, as I've known for some time, a pathetic piece of poo in Python 2.7.0 and older. Here's the turd, edited down a bit:

static PyObject *
_ldict(localobject *self)
{
    PyObject *tdict, *ldict;

    tdict = PyThreadState_GetDict();
    ldict = PyDict_GetItem(tdict, self->key);
    if (ldict == NULL) {
        ldict = PyDict_New(); /* we own ldict */

        int i = PyDict_SetItem(tdict, self->key, ldict);
        Py_DECREF(ldict); /* now ldict is borrowed */
        if (i < 0)
            return NULL;

        Py_CLEAR(self->dict);
        Py_INCREF(ldict);
        self->dict = ldict; /* still borrowed */
    }

    /* The call to tp_init above may have caused
       another thread to run.
       Install our ldict again. */
    if (self->dict != ldict) {
        Py_CLEAR(self->dict);
        Py_INCREF(ldict);
        self->dict = ldict;
    }

    return ldict;
}

We haven't seen any use of the Py_BEGIN_ALLOW_THREADS macro, so one thread's had the GIL the whole time. Locking around the assignment shouldn't have any effect, right?

Well, take a look at the Py_CLEAR(self->dict) statement: there's the perpetrator. That statement gets the ldict of the last thread that accessed this threadlocal, swaps it with NULL and decrefs it. If this is the last reference to ldict (because the last thread has died) then decref'ing destroys it, and the weakref callback to vigil runs. The callback does ncallbacks_lock.acquire, which releases the GIL before trying to get the mutex.

So here's the kind of scenario I prevented by locking around assignment to the threadlocal:

  1. Thread A starts, assigns to the threadlocal, dies.
  2. Thread A's ldict is now the threadlocal's self->dict and has a refcount of 1.
  3. Thread B starts, begins assigning to the threadlocal, enters the _ldict function.
  4. _ldict sets self->dict to NULL and decrefs Thread A's ldict, which runs on_thread_died, which calls ncallbacks_lock.acquire and releases the GIL.
  5. Now Thread C starts, begins assigning to the threadlocal, enters _ldict.
  6. Thread C finds self->dict is NULL, increfs its own local ldict and assigns it to self->dict. It exits _ldict.
  7. Thread B resumes at Py_CLEAR(self->dict), increfs its own ldict and assigns it to self->dict.

Thread B has now replaced a pointer to Thread C's ldict with a pointer to its own, but it didn't decref Thread C's ldict first. (_ldict wasn't written to survive interruption during Py_CLEAR.) Thread C's ldict will never be destroyed, and a weakref callback to its vigil attribute will never be called.
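The scenario above can be replayed deterministically with hand-managed reference counts. This is only a sketch of the bookkeeping, not CPython's code: FakeDict, FakeLocal, clear and install are made-up stand-ins for an ldict, the threadlocal, Py_CLEAR(self->dict), and the incref-and-assign that follows it. The final step, in which a later Thread D touches the threadlocal, is an extra added to show what happens afterwards.

```python
class FakeDict(object):
    """Stand-in for a thread's ldict, with a hand-managed refcount."""
    def __init__(self, owner):
        self.owner = owner
        self.refcount = 1       # the reference held by the thread's tdict
        self.destroyed = False

def incref(d):
    d.refcount += 1

def decref(d):
    d.refcount -= 1
    if d.refcount == 0:
        d.destroyed = True      # this is where the weakref callback would run

class FakeLocal(object):
    def __init__(self):
        self.dict = None        # models self->dict

def clear(local):
    """Model of Py_CLEAR(self->dict): null the slot first, then decref."""
    old, local.dict = local.dict, None
    if old is not None:
        decref(old)

def install(local, ldict):
    """Model of Py_INCREF(ldict); self->dict = ldict;"""
    incref(ldict)
    local.dict = ldict

local = FakeLocal()
a, b, c = FakeDict('A'), FakeDict('B'), FakeDict('C')

# Steps 1-2: Thread A runs _ldict to completion, then dies.
clear(local); install(local, a)
decref(a)                       # A's tdict goes away; refcount is now 1

# Steps 3-4: Thread B enters _ldict and is interrupted inside Py_CLEAR.
clear(local)                    # destroys A's ldict; the callback releases the GIL

# Steps 5-6: Thread C runs _ldict to completion while B is suspended.
clear(local); install(local, c)

# Step 7: Thread B resumes and overwrites self->dict.
install(local, b)               # C's ldict is never decref'ed

# Threads B and C die, then a later Thread D's _ldict clears the slot.
decref(b); decref(c)
clear(local)

for d in (a, b, c):
    print(d.owner, d.refcount, d.destroyed)
```

A's and B's stand-ins end up destroyed, while C's is stuck at a refcount of 1 with nothing left to decref it.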

Locking around assignment to the threadlocal prevents _ldict from running concurrently for any one threadlocal object, and prevents the refleak. In Python 2.7.1 and newer, the whole misguided self->dict system is removed from threadlocals and the lock's not needed.

This scenario applies to PyMongo's connection pool because the pool does need to acquire a lock in its weakref callback. Even if it didn't, there's a possibility of interruption whenever a thread is running Python code.

A Kvetch

This testing, the bug it revealed, the investigation, the fix: all this effort was spent to support entirely obsolete versions of Python. The Python core developers stopped maintaining them years ago, but PyMongo supports all Pythons going back to 2.4, mainly because there are "long-term support" Linux distros like Ubuntu and RHEL that once shipped with them. I have very savvy friends writing new applications on Python 2.6. Our children will have flying cars before we're done debugging these steam-powered versions of Python.

It's particularly frustrating because there's no point even filing bugs against Pythons before 2.7. "We fixed it," the developers will reply. "Upgrade." In Python 2.6, no one can hear you scream.

April Street Portraits

I plan to continue my portrait project at transitional housing facilities. But scheduling those shoots is slow. Meanwhile, I need new pictures for the classes I'm taking, so I photographed some strangers in the East Village.

April 2013 street portraits 2

April 2013 street portraits 1

April 2013 street portraits 4

April 2013 street portraits 3

I notice more than ever, in this set, how much I'm influenced by Hiroh Kikai's Asakusa Portraits. Of course I'm not a fraction of the photographer he is. But like the poet Kenneth Koch said, I "like to be influenced."

Moraff's World

A long-quiescent memory got knocked loose. I recalled that I'd played Moraff's World obsessively as a kid, sneaking out of bed at night to play it on my mother's Tandy 3000. So I downloaded the game and played it for a few hours this week in a DOS emulator.

Moraff's World is a fantasy role-playing game from 1991. It has the usual mechanics of the genre: You choose a race and a class like Fighter or Wizard, explore dungeons, and gain money and items by killing monsters. But Moraff's World is distinguished by its insane complexity. Characters can be one of eight races and seven classes. Killed monsters drop money in seven currencies. There are over a hundred distinct spells, and they come in books, scrolls, wands, and papers. Some characters can learn wizardly or priestly spells, some can learn both. The main UI looks like the bridge of a nuclear submarine:

Moraff dungeon

You look in all four directions at once. There is also an overhead map, like this section of town:

Moraff town

It is the player's job to memorize that yellow squares are temples, red are inns, and so forth.

In a modern role-playing game like Diablo, the town feels alive: music plays, rain patters down, random characters walk around and talk. Moraff's town is vacant and still.

Moraff town 2

It is not characters who speak to you in Moraff's World, but the programmer Steve Moraff himself. When you enter a bank, the options include PRESS 4 TO ROB BANK. Do so, and the game replies, COME ON! DO YOU REALLY THINK I'D LET YOU ROB MY OWN BANK? PRESS ANY KEY TO CONTINUE.... You are not immersed. You are explicitly in a game designed by a single programmer.

The experience does not resemble a fantasy movie like Lord of the Rings as much as it does reading a fantasy book. When I see a balrog in the movie, I see its fiery skull and its whip, pretty much the same as other viewers. How different from reading Tolkien's description: "a great shadow, in the middle of which was a dark form, of man-shape maybe, yet greater; and a power and terror seemed to be in it and to go before it." I make my own balrog from these words. In the same way, in the absence of any sound or animation, I supply the missing life to Moraff's static world.

This is not to say that the game lacks charm. It is incredibly idiosyncratic. The monsters seem drawn in MS Paint by an exuberant child. Consider this werewolf, and what appears to be a Hawaiian zombie:

Moraff monsters

The game's engineering is as primitive as its art. Graphics are drawn in layers with the Painter's Algorithm. With each step your character takes, the views in all four directions are re-rendered. The walls are slowly drawn on screen from farthest to nearest. Even on my modern laptop this can take some time when looking down a long hallway. But the technique lends itself to fun effects, like the translucent Shadow Minidragon, partially drawn over the walls behind him:

Moraff minidragon
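For readers who haven't met the painter's algorithm, here is a toy sketch of the idea on a single row of "pixels". It is not Moraff's code: the scene, the paint function, and the use of None for a see-through pixel are all invented for illustration.

```python
def paint(width, shapes):
    """Paint shapes onto a row of pixels, farthest first."""
    row = [' '] * width
    for shape in sorted(shapes, key=lambda s: s['distance'], reverse=True):
        for x, pixel in zip(range(shape['left'], width), shape['pixels']):
            if pixel is not None:   # None marks a see-through pixel
                row[x] = pixel
    return ''.join(row)

scene = [
    {'distance': 1, 'left': 2, 'pixels': ['D', None, 'D', None]},  # translucent dragon
    {'distance': 5, 'left': 0, 'pixels': list('########')},        # far wall
    {'distance': 3, 'left': 4, 'pixels': list('====')},            # nearer wall
]
print(paint(8, scene))  # ##D#D===
```

Nearer shapes simply overwrite farther ones, and wherever the dragon has a gap, whatever was painted earlier shows through.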

Moraff applied a similar method to the Wilderness. You may climb a ladder out of town to reach this randomly-generated landscape. It takes several seconds to calculate. (It took several minutes on my mother's Tandy.) If you hit H for Help, Moraff tells you that there's no point exploring the wilderness. It only leads to other dungeons which are all the same. It's a sightseeing expedition.

Moraff wilderness

When I downloaded it this week, my first impression of this Shareware-era game was nostalgic. Back then, games were envisioned by a few people or, in the case of Moraff, one programmer-auteur. There was room for a folk genius to succeed with a very weird game. Nowadays he'd be drowned out by games with hundred-million-dollar budgets like Grand Theft Auto V.

But on second thought, the industry is simply more mature, with a bigger audience and a broader range of games. There's still a place for avant-garde titles developed by small teams, like Braid or Sword & Sworcery. Both use the vocabulary of the simple video games we played as children to evoke grown-up ideas.

Braid is a platform-jumping game, explicitly an homage to Super Mario Brothers. There's even a princess to rescue. But the protagonist has mysterious powers to slow or reverse time, and the game asks: if you had these powers, what would you be? Does the princess want to be rescued? Are you playing the good guy or not?

Braid

In the iPad game Sword & Sworcery, the graphics are deliberately archaic and pixellated, but the themes are innovative. The game makes surprising demands regarding its place in the player's life. After you beat a level, for example, Sword & Sworcery pauses for a minute and suggests you take a break and do something else. There are levels that can only be played near a full moon, or a new moon. I changed my iPad's date so I could play them immediately. The designers' goal—to make me aware of my addictive game-love—was accomplished.

Sword and Sworcery

The other striking idea of Sword & Sworcery is that one's character does not level up. Instead, with each victory she is weakened. She must keep fighting the same monsters but they grow tougher as the protagonist becomes more vulnerable. At the end, she beats the final boss, but she is retching blood, and flings herself into the river to die. By comparison, role-playing games where your character gains godlike powers seem like childish wish-fulfillment. If you were really a warrior come to save a town, this would be how you'd end up.

The Green Matrix

For a year and a half I've been part of the team maintaining PyMongo, the Python MongoDB driver. It's one of the most widely used Python packages with 1.5 million lifetime downloads. The code itself is only moderately complex; about 8300 source lines. What makes it a tiny horror to work on is the range of environments we support. Here's our test matrix in Jenkins:

PyMongo test matrix

That's 72 test configurations. (It looks like more than that, but we don't test Jython and PyPy with C extensions compiled since that currently doesn't make sense.) The dimensions are:

  • Python version: We support CPython 2.4 through 3.3. On each commit we test just the highlight versions: 2.4, 2.7, and 3.3. We also support the latest Jython and PyPy. We test the intermediate versions like 2.5 and 2.6 before a release.

  • C extensions: we have a few key parts of PyMongo implemented in C for speed, with pure-Python versions as a fallback. We test both modes.

  • MongoDB Version: We test the latest development branch of MongoDB (2.5) plus the last two production versions.

  • MongoDB Configuration: We set up a single server, a master-slave pair, and a three-node replica set, and run mostly the same tests against all.
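The figure of 72 can be reproduced from these dimensions. This is a sketch, not our Jenkins configuration: the labels for the MongoDB versions are placeholders, and the only rule encoded is that Jython and PyPy run without the C extensions.

```python
from itertools import product

# Interpreter and C-extension combinations. Jython and PyPy are only
# tested without the C extensions.
interpreters = [(py, ext)
                for py in ('CPython 2.4', 'CPython 2.7', 'CPython 3.3')
                for ext in (True, False)]
interpreters += [('Jython', False), ('PyPy', False)]

# Placeholder labels: the development branch plus two production releases.
mongodb_versions = ['development', 'latest production', 'previous production']
mongodb_configs = ['single server', 'master-slave', 'replica set']

matrix = [(py, ext, version, config)
          for (py, ext), version, config
          in product(interpreters, mongodb_versions, mongodb_configs)]
print(len(matrix))  # 72
```

Eight interpreter combinations times three MongoDB versions times three MongoDB configurations gives the 72 cells.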

In each test configuration, PyMongo's test suite has about 430 individual test functions.

This covers the main test matrix, but there are some auxiliary tests we run in Jenkins on every commit. We have a mod_wsgi test that runs a few thousand web requests (first serial, then parallel) against a web app using mod_wsgi in a range of configurations:

  • Python 2.4, 2.5, 2.6, and 2.7

  • mod_wsgi 2.8, 3.2, and 3.3

  • The latest production MongoDB as a single server or replica set

The mod_wsgi tests are there to ensure we never recreate a connection leak like the apocalyptic "unbounded connection growth with Apache mod_wsgi 2.x" bug to which I lost some of the best weeks of my life.

I've also set up some tests for Motor, my non-blocking MongoDB driver for Tornado: I run in Python 2.6, 2.7, and 3.3 against a single MongoDB server and a replica set, running the three most recent versions of MongoDB. I have a separate Motor test that connects to MongoDB over SSL, and finally I have a test of "Synchro," which wraps Motor inside a resynchronization layer and checks it can pass all the same tests as PyMongo. In all, Jenkins runs 33 test configurations for each Motor commit.

Jenkins automatically tests our main configurations, but we periodically hand-test some additional configurations, like sharded clusters, beta releases of Jython and PyPy, and Windows. We'll put some of these in Jenkins too.

For a team of three people to build and maintain this volume of test infrastructure is a huge effort. It's clearly worth it, because the test matrix is so large. But it's not much fun.

Lessons learned:

  • Test code is a liability: Too much testing code is as bad as too much of any other kind of code. Write as few tests as possible to cover the cases you need to test. Over-testing comforts the novice but impedes agility. For example, when we renamed PyMongo's Connection class to MongoClient, I had to change over 1000 lines in 32 files in the test suite. A commit that huge is a barrier in the repository's history, across which no commit can be moved without conflicts. I hope to never do anything like it again. The test suite should be smaller and better factored.

  • Tests must be very reliable: The test suite needs to be not only minimal but also very reliable. Tests should fail if and only if the behavior they test breaks. When I joined the team, PyMongo's tests often failed "just cuz." Fixing them all took months: We'd observe an intermittent failure in Jenkins due to some race condition that we couldn't reproduce on our laptops (an EC2 "medium" instance runs a three-node MongoDB cluster slower than you could possibly imagine). We'd think real hard and finally understand and fix the failure. Then we'd do the same for some other test. It was a costly exercise but necessary: It wasn't until our tests always passed that we took them seriously when they didn't.

There are other dicta that I find negotiable: tests should be fast, sure, but I can live with a test suite that takes a few minutes to run per configuration. Perhaps test methods should include only one assert, but I can live with several asserts in some methods.

I'm implacably opposed to mocking when it comes to testing PyMongo: what our tests verify is primarily our understanding of how to talk to MongoDB. If we mocked any aspect whatsoever of the MongoDB server, our tests would be worse than useless. Virtually every test of PyMongo is an integration test, so we make no distinction between "unit tests" and "integration tests."

I'm curious what others have learned from maintaining a driver's test suite. It seems to be a lot of hard work no matter what.

I Will Pick Up What Others Discard

My friend Jim Roberts emailed me this quote from Master Hua, a founder of Chan Buddhism in the West:

Those in search of the Way should bear this in mind: "I will pick up what others discard." What others do not want, I want; what others will not eat, I will eat; what others will not suffer, I will suffer; what others will not tolerate, I will tolerate; what others will not permit, I will permit; what others will not do, I will do. If you want to support others, you must do it from below. "Seeking the Way from a lower place" means starting from below, not standing up at the top of the mountain. You will never see the Way from the top of Mount Sumeru; but when you are at the very bottom of Mount Sumeru, there you will find the Way.

Jim says this reminds him of street retreat. Yes. Here's my friend Shōin collecting cans during a street retreat last year:

Shoin collecting cans

This non-rejecting mind, this mind of spiritual poverty, is the muscle we're training when we're on the street.

Slides From My Talk On Python Coroutines

Here's the slides from tonight's NYC Python Meetup talk on coroutines in Tornado and Tulip. The slides are a bit inscrutable on their own—it's my style to just show code, then talk a lot to explain the code. Still, if you were there tonight you may find these useful.

Python Coroutines, Present and Future from emptysquare