2011
Anne Kaneko's Blog: Radiation (1)
Borrowing some data from the source below, this could be 2.1 mSv for the year to 11 March 2012. That is the same as the UK average, but well above the Japanese recommended limit.
The Britney Spears Problem - Stream Gauges
Suppose a pipeline delivers an endless sequence of nonnegative integers at a steady rate of one number every t time units. We want to build a device—call it a stream gauge—that intercepts the stream and displays answers to certain questions about the numbers.
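As a rough illustration of the stream gauge idea, here is a minimal Python sketch. It assumes the "certain questions" are simple ones (how many numbers have passed, their sum, maximum and mean); that choice is mine and not taken from the article. The names `StreamGauge` and `read_stream` are hypothetical. The point it shows is that each number is seen once and only a fixed amount of state is kept.

```python
from typing import Iterable, Optional


class StreamGauge:
    """Constant-memory gauge: sees each number once, keeps only summaries."""

    def __init__(self) -> None:
        self.count = 0                        # how many numbers have gone by
        self.total = 0                        # running sum
        self.maximum: Optional[int] = None    # largest value seen so far

    def observe(self, value: int) -> None:
        # The stream is described as nonnegative integers; reject anything else.
        if value < 0:
            raise ValueError("expected a nonnegative integer")
        self.count += 1
        self.total += value
        if self.maximum is None or value > self.maximum:
            self.maximum = value

    def mean(self) -> Optional[float]:
        # Undefined until at least one number has been observed.
        return self.total / self.count if self.count else None


def read_stream(values: Iterable[int]) -> StreamGauge:
    """Feed every value of a (possibly very long) iterable through one gauge."""
    gauge = StreamGauge()
    for value in values:
        gauge.observe(value)
    return gauge
```

Questions such as "which value is the most frequent" do not fit this constant-memory pattern so easily, which is presumably where the problem in the title comes in.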
2010
Mining of Massive Datasets
Chapter 1 Data Mining
Chapter 2 Large-Scale File Systems and Map-Reduce
Chapter 3 Finding Similar Items
Chapter 4 Mining Data Streams
Chapter 5 Link Analysis
Chapter 6 Frequent Itemsets
Chapter 7 Clustering
Chapter 8 Advertising on the Web
Chapter 9 Recommendation Systems
Copenhagen Wheel project
Réseaux sociaux, analyse et data mining (Social networks, analysis and data mining)
Tom Morris: 2010-02-22
Another small tracking system for when you use different computers and/or send emails from different contexts.
It comes with a burden, though: you have to send yourself a copy of your emails.
Each email will contain a Received: field, from which you can extract the IP address the mail was sent from.
You can imagine a process which puts you in Bcc and then deletes the email. A bit hacky, but it could work.
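A minimal sketch of the extraction step, using Python's standard `email` module. It assumes the relaying servers write the sender's IPv4 address in square brackets inside the Received: header, which is common but not guaranteed. The function name `received_ips` and the sample message are hypothetical.

```python
import re
from email import message_from_string
from typing import List

# IPv4 address in square brackets, as many mail servers write it.
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message: str) -> List[str]:
    """Return bracketed IPv4 addresses found in Received: headers, in header order."""
    msg = message_from_string(raw_message)
    ips: List[str] = []
    for header in msg.get_all("Received", []):
        ips.extend(IPV4_IN_BRACKETS.findall(header))
    return ips


# Hypothetical message, using documentation-only addresses and domains.
SAMPLE = (
    "Received: from laptop.example.org (laptop.example.org [192.0.2.10])\n"
    "\tby mail.example.org with ESMTP; Mon, 22 Feb 2010 19:54:00 +0000\n"
    "From: me@example.org\n"
    "Bcc: me@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

print(received_ips(SAMPLE))
```

Servers usually prepend their own Received: header, so the hop closest to the sending machine tends to be the last one in the list. Some providers hide or rewrite the client address, so the result is only a hint.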
writing | ben fry » Taking the “vs.” out of Man & Machine
“contrary to traditional assumptions, the uniquely human faculty of reason (conscious, intelligent, rational thought) requires very little computation, but that the unconscious sensorimotor skills and instincts that we share with the animals require enormous computational resources”
2009
Textual Log Analysis using Python « Isotoma Blog
Now, having logs of the channel reaching many megabytes, I was curious as to the text statistics produced by this channel, who has what reading age, and how much they’ve talked in comparison to other people.
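The "how much they've talked" part can be sketched in a few lines of Python. This is not the method from the Isotoma post. It assumes a hypothetical log format where each spoken line looks like `<nick> message`, and it skips the reading-age calculation entirely. `words_per_speaker` and `talk_share` are illustrative names.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Assumed log line format: "<nick> message text". Other lines are ignored.
LINE = re.compile(r"^<(?P<nick>[^>]+)>\s+(?P<text>.*)$")


def words_per_speaker(lines: Iterable[str]) -> Counter:
    """Count whitespace-separated words spoken by each nick."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            counts[match.group("nick")] += len(match.group("text").split())
    return counts


def talk_share(counts: Counter) -> Dict[str, float]:
    """Fraction of all counted words contributed by each nick."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {nick: n / total for nick, n in counts.items()}
```

Because the input is any iterable of lines, a multi-megabyte log can be passed as an open file object and processed line by line.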
Searchable: Annotation-Driven Indexing and Searching with Lucene :: Drive-by Digressions
Searchable is a toolkit for Lucene that harnesses the power of annotations to specify what properties to index and how to treat them.
PhotoMaker
Interesting how Placemaker reacts differently to a text with "Rouen" and "at Rouen".
This small mashup uses YQL to combine the power of Flickr and Yahoo! Placemaker. Copy and paste some text into the textbox below and click the button: the application will first geo-locate all places in the text, and then will try to find photographs, published under a Creative Commons license, which were geotagged at or near the places found.
MIT Media Lab: Reality Mining
Reality Mining defines the collection of machine-sensed environmental data pertaining to human social behavior. This new paradigm of data mining makes possible the modeling of conversation context, proximity sensing, and temporospatial location throughout large communities of individuals. Mobile phones (and similarly innocuous devices) are used for data collection, opening social network analysis to new methods of empirical stochastic modeling.
The original Reality Mining experiment is one of the largest mobile phone projects attempted in academia. Our research agenda takes advantage of the increasingly widespread use of mobile phones to provide insight into the dynamics of both individual and group behavior. By leveraging recent advances in machine learning we are building generative models that can be used to predict what a single user will do next, as well as model behavior of large organizations.
Official Google Blog: 30,000 new Google Apps business users at Valeo
This marks a significant moment for Google Apps, because Valeo has 30,000 Internet-connected employees, making this one of the largest enterprise deployments of Google Apps to date. Valeo is moving to the cloud
Soon, Google Suggests powered by Valeo. We won't even need to do technology watch any more.
2008
wiki.dbpedia.org : About
Logiciel statistique et datamining (Statistical and data mining software)