PUBLIC MARKS with tag data

This month

High Scalability - Tumblr Architecture - 15 Billion Page Views a Month and Harder to Scale than Twitter

by karlcow

Growing at over 30% a month has not been without challenges. Some reliability problems among them. It helps to realize that Tumblr operates at surprisingly huge scales: 500 million page views a day, a peak rate of ~40k requests per second, ~3TB of new data to store a day, all running on 1000+ servers.

distributed… owning our data… etc. *sigh*
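The figures quoted above can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation: the 30-day month, the use of exactly 1000 servers (the excerpt says "1000+"), and the helper names are assumptions made for illustration, and page views are not the same thing as requests.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the excerpt above.
PAGE_VIEWS_PER_DAY = 500_000_000
PEAK_REQUESTS_PER_SECOND = 40_000
NEW_DATA_TB_PER_DAY = 3
SERVERS = 1000  # assumption: the excerpt says "1000+", so this is a lower bound


def monthly_page_views(per_day, days=30):
    """Page views per month, assuming a 30-day month."""
    return per_day * days


def average_per_second(per_day):
    """Average rate per second if the daily total were spread evenly."""
    return per_day / SECONDS_PER_DAY


views_per_month = monthly_page_views(PAGE_VIEWS_PER_DAY)
avg_views_per_second = average_per_second(PAGE_VIEWS_PER_DAY)
new_data_gb_per_server = NEW_DATA_TB_PER_DAY * 1000 / SERVERS  # decimal TB to GB

print(f"page views per month: {views_per_month:,}")
print(f"average page views per second: {avg_views_per_second:,.0f}")
print(f"new data per server per day: {new_data_gb_per_server:.1f} GB")
```

Under these assumptions, 500 million page views a day is consistent with the 15 billion a month in the title, and the average page-view rate is well below the quoted peak of ~40k requests per second.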

January 2012

The Emperor's New Client

by karlcow

It's funny how you don't hear so much about service mashups these days, despite their undeniable coolness. I'll assert that it's because developing for Web data in the browser is bloody hard work, especially when there are NxN arbitrary API mappings to know.

December 2011

AntiMap Log | AntiMap

by karlcow

AntiMap Log is a smartphone utility application for ‘recording’ your own data. Whether you're out snowboarding, skiing, mountain biking, driving, running, or whatever you're into, AntiMap Log is a DIY solution for gathering real-time stats with your phone. The indexed data can then be used in conjunction with any of the free AntiMap post analysis applications (or your own creations) to visualize your every move.

November 2011

October 2011

September 2011

IM2GPS: estimating geographic information from a single image

by karlcow

Estimating geographic information from an image is an excellent, difficult high-level computer vision problem whose time has come. The emergence of vast amounts of geographically-calibrated image data is a great reason for computer vision to start looking globally — on the scale of the entire planet! In this paper, we propose a simple algorithm for estimating a distribution over geographic locations from a single image using a purely data-driven scene matching approach. For this task, we will leverage a dataset of over 6 million GPS-tagged images from the Internet. We represent the estimated image location as a probability distribution over the Earth's surface. We quantitatively evaluate our approach in several geolocation tasks and demonstrate encouraging performance (up to 30 times better than chance). We show that geolocation estimates can provide the basis for numerous other image understanding tasks such as population density estimation, land cover estimation or urban/rural classification.
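The general idea in the abstract, matching a query image against GPS-tagged images and reporting a distribution over locations, can be illustrated with a toy sketch. This is not the paper's implementation: the two-number feature vectors, the k-nearest-neighbour vote, the 10-degree grid, and every name below are assumptions chosen to keep the example small.

```python
import math
from collections import Counter


def k_nearest(query, database, k):
    """Return the k entries whose feature vectors are closest to the query."""
    return sorted(database, key=lambda e: math.dist(query, e["features"]))[:k]


def location_distribution(query, database, k=3, cell_deg=10.0):
    """Turn the GPS tags of the k best matches into probabilities per grid cell."""
    votes = Counter()
    for entry in k_nearest(query, database, k):
        lat, lon = entry["gps"]
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        votes[cell] += 1
    total = sum(votes.values())
    return {cell: count / total for cell, count in votes.items()}


# Hypothetical database: made-up scene features with (latitude, longitude) tags.
database = [
    {"features": (0.10, 0.90), "gps": (48.85, 2.35)},
    {"features": (0.20, 0.80), "gps": (48.86, 2.29)},
    {"features": (0.90, 0.10), "gps": (40.71, -74.00)},
    {"features": (0.15, 0.85), "gps": (41.90, 12.50)},
]

distribution = location_distribution((0.12, 0.88), database, k=3)
for cell, probability in sorted(distribution.items()):
    print(cell, round(probability, 2))
```

Each grid cell is identified by its floored latitude and longitude index, so the result is a coarse probability distribution over the Earth's surface rather than a single point estimate.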

August 2011

pandas: a python data analysis library — pandas v0.4.0dev documentation

by karlcow

pandas is a python package providing convenient data structures for time series, cross-sectional, or any other form of “labeled” data, with tools for building statistical and econometric models.
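A short sketch of what "labeled" data means in practice. It assumes a current pandas release, whose API may differ from the v0.4.0dev documentation bookmarked here, and the prices and volumes are made-up values.

```python
import pandas as pd

# A small daily time series, labeled by date.
dates = pd.date_range("2011-08-01", periods=4, freq="D")
prices = pd.Series([10.0, 10.5, 10.2, 10.8], index=dates)

# A second series that covers only the last three dates.
volumes = pd.Series([100, 120, 90], index=dates[1:])

# Arithmetic aligns on the date labels, not on position;
# dates missing from either series produce NaN.
turnover = prices * volumes

# Combining the series into a table aligns them on the same labels.
frame = pd.DataFrame({"price": prices, "volume": volumes})

print(turnover)
print(frame)
```

The first date has no volume, so its turnover is NaN instead of being silently paired with the wrong row.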

July 2011

The Microsoft Update: Google explains its data correlation privacy settings

by karlcow

A company spokesperson told me that the search giant is not looking at public phone directories to match phone numbers with user names. But it is looking through social media sites to correlate those accounts to your Google user name and profile.

The Decline of the Online Message Board - NYTimes.com

by karlcow

By contrast, the Web 2.0 juggernauts like Facebook and YouTube are driven by metrics and supported by ads and data mining. They’re networks, and super-fast — but not communities, which are inefficient, emotive and comfortable. Facebook — with its clean lines and social expressways — is Robert Moses par excellence.

June 2011

Official Google Blog: Introducing schema.org: Search engines come together for a richer web

by karlcow

introduces schemas for more than a hundred new categories, including movies, music, organizations, TV shows, products, places and more.

Ironic how Google is finally coming around to the way of the old Yahoo!

May 2011

April 2011

Atlas of the Habitual

by karlcow

If you had a visualization of every place you've been for 200 days, what could you do with it? What could it tell you about yourself and how could others use the data?

Technology allows us to see information in a way we never could before. Atlas of the Habitual is about creating data out of the everyday, the hyper-digitizing of your life.

Jdrop | Welcome to Jdrop

by karlcow & 1 other

Jdrop provides a place to store JSON data in the cloud.

The initial application is for storing performance data gathered from mobile devices.

It's hard to analyze large amounts of information (HTTP waterfall charts, HTTP headers, document source, etc.) on a mobile device.

Jdrop lets you gather this data on the mobile device but analyze it remotely on a larger screen.
