Is my Dropbox going to sell me out?

Let us assume for a moment that I have NSFW content on my personal (paid for by me) Dropbox, which I connect to my work computer. 

Am I then “storing” NSFW content on my work computer? After all Dropbox locally caches pretty much everything. If I’m using Dropbox at work for things like 1Password or syncing application config, am I obligated to ensure that no NSFW content is part of the bargain?

(Note that I am not looking at the NSFW content at work. It is just passively there. Obviously if I use Dropbox to sync NSFW content so I can look at it at work, I’m in the wrong.)

Simpler, but vulgar

Allow me to make a stronger, simpler, and far more vulgar point than this post:

Hacker News is awful. The “community” is mostly bored shitheads posting from work, or massively self-important assholes who are bitter their TODO app didn’t get into YCombinator after 15 tries. 

Twitter accounts like Shit HN Says could be 10x as content-rich if they wanted to. Most of the things posted on HN are just truly awful, in terms of a community of ostensibly technical people. 

You can see craven, callow, shallow, mean, bitter, obnoxious, stupid … the entire range of awful human interactions. Honestly, when’s the last time a HN comment really improved anything? Really made a strong contribution to your job, your hobby? 

As a community-curated link site, Hacker News has good stuff most of the time. As a place for community discussion, Hacker News is just YouTube comments with better grammar.

MongoDB and FUD

The Problem:

Our data set consists of gigabytes upon gigabytes of pickled Python dictionaries, CSV files, and plain text, with the odd bit of Excel or Word. I have three goals:

  • Maintain this monstrosity
  • Create a searchable index
  • Build a new version for the future.

The entire app is a single monolithic Python app: there is no such thing as a “front end” or a “back end” or “middleware”. It’s a web app but there are no templates; it generates HTML via print statements. The same Python file may include standalone logic, or shared logic to be used by other components. It’s a bit of a mess. Lastly, the framework it uses is basically abandonware; I haven’t tried to see if it runs under any Python after 2.4, and you can be sure it won’t work under 3.

My first task was the search problem. I started with Whoosh, but after about a year it started to run into performance problems, and I’d also learned enough about information retrieval that I wanted some more features. The Whoosh guy is awesome and he’s done a hell of a thing, though; I cannot recommend it enough for smaller projects, but I needed more. I’d attended a talk at PyCon about Elasticsearch, so I switched to that, and it’s been awesome.

My strategy was pretty simple: a cron job to regenerate the world. Since Elasticsearch is really, really fast, it took perhaps 30 minutes to reindex the entire data set, and since it’s not a 24/7 use case, running it at night is no big deal. (I’d like to provide real-time search but my users rarely need it; they’re content to have today’s new data appear tomorrow.)

This worked so well for two reasons. First, I’d learned enough about the “common data set” to make the custom indexer pretty easy to work with: I knew enough about my users’ search needs that I could ignore 99.9% of the data. Second, Python dictionaries map really well to JSON, which Elasticsearch uses as its input and output.
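To make the dictionaries-to-JSON point concrete, here is a minimal sketch using only the standard library. The record and its field names are made up for illustration; nothing here is the actual indexer, and the Elasticsearch call itself is left out.

```python
import json
import pickle

# Hypothetical record, standing in for one pickled dictionary from the flat-file store.
record = {"projectid": 42, "title": "Bridge survey", "tags": ["survey", "2011"]}

# Simulate reading it back from disk: pickle bytes in, plain dict out.
pickled = pickle.dumps(record)
loaded = pickle.loads(pickled)

# A dict of plain strings, numbers, and lists serializes to JSON with no mapping layer.
document = json.dumps(loaded, sort_keys=True)
print(document)

# And it round-trips back to an equal dict.
assert json.loads(document) == record
```

The catch is values JSON has no type for (dates, sets, custom objects); those need a conversion step before `json.dumps` will accept them.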

In building the regenerate-the-world scripts, I had written a huge amount of code to 1) walk the entire flat-file “database” and 2) make lots and lots of sense of it all. I did stuff like, “ensure that every disparate part of the app always refers to a Project by the faux-primary-key ‘projectid’ instead of ‘pj’ and ‘projid’ and whatever else”. My indexer did a pretty decent job of cleaning up this semi-schemaless data; so now what?
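The key cleanup boils down to something like the sketch below. The alias table and the `normalize_keys` helper are hypothetical names for illustration, not the real indexer code, and raising on conflicting values is just one assumed policy.

```python
# Hypothetical alias table: every spelling seen in the wild maps to one canonical key.
KEY_ALIASES = {"pj": "projectid", "projid": "projectid"}


def normalize_keys(record, aliases=KEY_ALIASES):
    """Return a copy of record with aliased keys renamed to their canonical form."""
    cleaned = {}
    for key, value in record.items():
        canonical = aliases.get(key, key)
        # Two aliases for the same key with different values is a data problem,
        # so fail loudly instead of silently picking one.
        if canonical in cleaned and cleaned[canonical] != value:
            raise ValueError(f"conflicting values for {canonical!r}")
        cleaned[canonical] = value
    return cleaned


print(normalize_keys({"pj": 7, "name": "Bridge survey"}))
```

Running every record through one function like this means the search index and any future data store see the same key names.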

Since our app uses CouchDB, it was my first choice, and very quickly abandoned. I loathe CouchDB. It makes a lot of sense in our app, but not for a general-purpose data store. 

Up next was “any ol’ RDBMS”, which means MySQL. Attempts to hammer the semi-schemaless data into relational format resulted in a data model so complex and byzantine, it was practically recursive. Instead of 3rd normal form I made a wormhole into a hell-dimension. So, no.

Despondent and generally upset, I tried MongoDB. And it worked! Experiments worked really well! 

  • As I said, Python dictionaries map very well to JSON/BSON so the amount of friction in import/export was minimal.
  • Ad-hoc queries
  • easy blob storage for stuff like Word documents
  • It’s fast (importing the world took perhaps 20 minutes)
  • It’s easy to set up (compile and go, basically)
  • Support for every language and platform I could think of
  • Has some replication capability in case I ever need it

I wasn’t really sure about a couple things, mainly backup-and-restore, but that was really my only concern, and the Mongo docs on the topic seemed straightforward enough; my users can tolerate an hour of downtime.

And now, the point of my little story: I think MongoDB is picked on more than just about any platform save PHP. There is so much fear, uncertainty, and doubt spread about it, it’s started to leak into my world and freak me out.

Consider the most recent thing, the “randomly log stuff” bit in the Java driver. Places like /r/shittyprogramming were all over it with digital brickbats. Every thread was then a free-for-all of “here’s how MongoDB screwed me over/Here’s why MongoDB sucks” stories from all over the internets.

Panic set in. This data is mission-critical; while my users can tolerate small amounts of downtime and don’t need OTP-type features, it’s still mission-critical data. Have I fucked up royally here? Have I set myself up for epic fail? Or am I just giving in to the sort of FUD that pervades every goddamn internet discussion about any sort of technology? Let’s face it: people pile on and rarely are they anywhere nearly as awesome as they think they are. 

At this point I’m not entirely sure what to do. My thought was to return to the cold comfort of MySQL, using a FriendFeed-style schemaless system. It’s a huge orthogonal step but I’ve recovered horribly fucked MySQL databases after three-too-many bottles of Tequila, so it’s safe and well-understood. It puts the onus on me to write the entire friggin’ access layer, but whatever. I know about Postgres and JSON, but I don’t know Postgres at all.
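For the curious, the FriendFeed-style approach looks roughly like this: one table of opaque serialized entities, plus a separate narrow table per field you want to query. This is a sketch under loud assumptions: it uses `sqlite3` from the standard library as a stand-in for MySQL so it runs anywhere, it stores JSON text in the body, and the table names and helpers are invented for illustration.

```python
import json
import sqlite3
import uuid

# sqlite3 stands in for MySQL here; the table layout is the point, not the engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE index_projectid (
        projectid INTEGER NOT NULL,
        entity_id TEXT NOT NULL REFERENCES entities(id)
    );
    CREATE INDEX idx_projectid ON index_projectid (projectid);
""")


def save_entity(conn, entity):
    """Store the whole dict as a JSON blob and index only the fields we query on."""
    entity_id = uuid.uuid4().hex
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO entities (id, body) VALUES (?, ?)",
            (entity_id, json.dumps(entity)),
        )
        if "projectid" in entity:
            conn.execute(
                "INSERT INTO index_projectid (projectid, entity_id) VALUES (?, ?)",
                (entity["projectid"], entity_id),
            )
    return entity_id


def find_by_projectid(conn, projectid):
    """Look up entity ids in the index table, then decode the matching blobs."""
    rows = conn.execute(
        "SELECT e.body FROM entities e "
        "JOIN index_projectid i ON i.entity_id = e.id "
        "WHERE i.projectid = ?",
        (projectid,),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]


save_entity(conn, {"projectid": 42, "title": "Bridge survey", "notes": "anything goes"})
save_entity(conn, {"title": "No project yet"})
matches = find_by_projectid(conn, 42)
print(matches)
```

The blob can hold whatever shape the record happens to have; the price is that every queryable field needs its own index table and its own bit of access-layer code, which is exactly the work mentioned above.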

Am I giving in to FUD? Do I stay the course, trusting that my proven, real-world positives outweigh potential negatives?

World Design and “instant-only” magic

So to go into detail a little further on last night’s little thought, I keep thinking about fantasy worlds designed around traditional D&D magic, and how utterly illogical they are.

My main point is that the presence of long-lasting magical effects would pretty much always result in something that is no longer a feudal society.

You can throw around the handwave of “but it’s FANTASY” all you want, but the fact is every society faces pretty much the same problems: food preservation, sickness, transportation, and war. Humans invented agriculture so as to allow food to be created literally while they were sleeping; medicine, because people get sick and die over the dumbest things, like a small infection; transportation, because sitting in one place eventually exhausts its local resources; and war, because fuck those other guys. The first 3 also serve to make war more efficient: moving troops and their food, and healing their injuries.

So your average fantasy world (as shipped by whatever edition of D&D is current) has spells for all those things, yet every game world is pretty much late feudal in structure. What total crap.

(SIDEBAR: Yes, I am aware of Eberron, which is I think the first mass-market not-dumb-feudal setting. Sadly I have yet to meet anyone who actually plays it.)

Anyway, I think the “fix” for this is to simply disallow any spell or effect that does anything that persists. It fixes every problem!

  1. No stupid “colleges of Magic”. Magic effects happen as a result of some unknown, unquantifiable process. There is no correspondence course to learn how to shoot lightning bolts from your butt.
  2. Great and mighty empires happen mostly as a result of conquest, not superior economics, which is what happens when you can preserve food, acquire resources, and keep your people healthy.
  3. Trap-laden holes in the earth are now essential means of guarding treasure, as you cannot simply ward them with impenetrable force fields.

And so on. The exact definition of “instant-only” is a little fuzzy at times: obviously “continual light” is out, but “cure disease”? In my mind, “cure disease” is not instant, but I suppose opinions vary.

A brief D&D nerd thought

What if “magic” was defined as only momentary violations of the laws of physics?

So one of the things that defines D&D magic is stuff that persists: enchantment of items, the “continual light” spell, and so on. What if the world allowed only momentary violations of physics, causality, whatever?

It seems like a small change on the surface, but it’s probably huge: it’s a world without “magic” swords, or whatever. Most fantasy game worlds are built on persistent magic.

Switching to Amazon Cloud Player

I’ve been using iTunes Match since beta, and I’ve mostly been happy with it. That is, until sometime around the most recent iTunes update cycle, when it started to become moody: flaky and unreliable on some days, fine on others. 

Taking this as a sign that it’s time to start putting effort into getting off the iTunes ecosystem, I have started using Amazon Cloud Player. 

The Good:

First and foremost, so far the best thing about it is it always works. I have not yet encountered flakiness, strangely disappearing songs that were in the playlist yesterday, interminable day-long application stalling, songs that simply refuse to download, songs that simply stop playing halfway through and then won’t play at all, or any other general weirdness. The experience is as close to perfect as our computers allow. Upload a song, and it’s there, all the time, as long as Amazon itself is up. Since that is the primary purpose of these music-in-the-cloud services, I have to say Cloud Player is pretty great.

Device support is good. The recently released iPad app is snappy and easy to use, as is the iPhone app. I don’t have Android but I assume that it’s just as good, esp. given that they sell an Android device. As far as I know, it doesn’t support Roku or other TV devices, but here’s hoping one day it will (especially since e.g. Roku supports Amazon Video; obviously you can’t stream on AppleTV.)

Although hardly a seamless experience, you can continue to use iTunes (or whatever) to play music you buy and download from Amazon. Basically: buy digital on Amazon, and configure your browser to automatically download and import music. Said music will appear automagically on your Cloud Player. This gives you a pretty good best-of-both-worlds experience.

The cloud player is 99% pure HTML, so it performs very well. I run an instance in Fluid and it uses a pretty average amount of CPU.

The Less than Good:

The experience is what you might call “bare bones”. If you can’t live without smart playlists and ratings, not to mention pervasive drag-and-drop, Cloud Player probably isn’t for you. Other than the device apps there is no iTunes analog; the player is entirely web-based. The Mozilla-based Songbird player seems a natural fit in terms of technology – it even has a plugin to browse the Amazon MP3 store. I assume Amazon wants to avoid the business of software, since it produces very little in the way of desktop apps, so that’s just my blue-sky dream.

Speaking of desktop apps, the uploader uses the execrable AIR, so it’s horrible on lots of platforms. It stalls, sputters, lies, and generally makes a mess of things. If you have a truly immense library, fixing things might take a while; but the most common screw-ups are easily fixed with just a few clicks, and like I said, once they’re fixed they tend to stay fixed.

The match feature made a few mistakes, but so did iTunes, so that’s a wash.

About a third of the default presentation of the player UI is devoted to ads, cross-sell for the currently playing song/album/artist. You can make it go away with a little jQuery magic if it bugs you.

The Bottom Line:

If you need to escape the iTunes ecosystem, Cloud Player is a viable option, provided you properly calibrate your expectations. 

Thoughts on Komodo 8

So Komodo 8 is going to be released next week. I have mixed feelings.

First, there’s a lot of good: the visual refresh is much needed, and the update of the core components is even more helpful. Applications based on Mozilla really need all the great stuff Mozilla has come out with lately.

So that’s awesome. It feels faster and lighter, and looks prettier. Moving language keywords into the snippets library is awesome.

What bugs me is that some things seemed almost like a no-brainer to add, and they’re not there.

  1. Node.js is sort of a big deal right now. No built-in Node shell? Mozilla themselves use the hell out of Node so it’s not like Node is some sort of ‘enemy’ technology. And in any event, we’re on version 8 and there’s not even a SpiderMonkey-based shell. I cannot be the only person that spends a lot of time in a JS shell.
  2. I’m sort of shocked that there’s no real improvement to the source control stuff. A couple of basic commands, no real UI.
  3. Speaking of source control: I’m equally shocked that there’s no GitHub or Bitbucket integration. Imagine how awesome it would be to have your project push directly. The obvious retort – “Well, no one asked for it!” – is just as perplexing.
  4. The macro/expansion API seems more-or-less the same. It’s already pretty comprehensive and effective but there are lots of obvious gaps, and I’m shocked no one notices them (e.g., you can’t do a bunch of stuff via the keybinding UI or via macros; you have to click).
  5. I’m amazed that anyone likes the minimap thing and I feel like it’s going to end up a giant “what were we thinking?” feature, but whatever. Opinions and assholes, I guess.

Worst of all, RC1 is still buggy as hell. Forum chatter indicates they’re closing bugs really fast now, but it gives me The Fear that it won’t really be usable for a few minor releases. As I write this, there’s a really annoying display/redraw bug that’s making me almost motion-sick. This is a release-candidate.