|Filed Under:|Programming / Web Development|
|Posts on Regator:|27|
|Posts / Week:|0.1|
|Archived Since:|August 23, 2011|
The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high-level summary of its overall blog health: The Blog-Health-o-Meter™ reads Wow. Crunchy numbers The average container ship can carry about 4,500 containers. This blog was viewed about 18,000 times in 2010. If each view were a shipping …
If you’re into RESTful stuff, no matter if you’re a researcher or practitioner, consider submitting a paper to our WWW2011 Workshop on RESTful Design (see the Call for Papers for more details on how to participate). I’m very happy to see the workshop taking place again this year, after the huge success we had last …
In our daily work with Government data such as statistics, geographical data, etc. we often deal with Comma-Separated Values (CSV) files. Now, they are really handy as they are easy to produce and to consume: almost any language and platform I have come across so far has some support for parsing CSV files and I can …
This is an announcement and call for feedback. Over the past couple of days I’ve compiled a short review article where I look into NoSQL solutions and to what extent they can be used to process Linked Data. I’d like to extend and refine this article, but this only works if you share your experiences …
Where I discuss, through a simple example, why linking your data to other data on the Web makes sense.
Where Michael discusses the efficiency of relational databases for certain problems and the effectiveness of NoSQL for big, messy data.
Where I suggest that, rather than deleting data on the Web, you create a new version of it to avoid losing data.
This is the second post in the solving-tomorrow’s-problems-with-yesterday’s-tools series. In his seminal article If You Have Too Much Data, then “Good Enough” Is Good Enough, Pat calls for a ‘new theory for data’ – I’d like to call this: networked data (meaning: consuming and manipulating distributed data on a Web-scale).
Tomorrow, on 8.8., is the International JSON day. Why? Because I say so! Is there a better way to say ‘thank you’ to a person who gave us so much – yeah, I’m talking about Doug Crockford – and to acknowledge how handy, useful and cool the piece of technology is that this person ‘discovered’? From …
Where I'm reviewing support for encryption in the context of IaaS|PaaS|SaaS cloud service offerings as well as concerning Hadoop. While the motivation for encryption might differ, the primary question is whether systems support this (transparently) or whether developers are forced to code this in the application logic.
In situations where Hadoop is used in a shared setup we witness two competing forces: the user, who expects performance, vs. the cluster owner, who aims to optimise throughput and maximise utilisation. In the post, Michael elaborates a bit on challenges and solutions on this topic.
You might have already heard that MapR, the leading provider of enterprise-grade Hadoop and friends, is launching its European operations. Guess what? I’m joining MapR Europe as of January 2013 in the role of Chief Data Engineer EMEA and will support our technical and sales teams throughout Europe. Pretty exciting times ahead! As an aside: …
Today’s question is: where are we regarding MapReduce/Hadoop in the cloud? That is, what are the current offerings of Hadoop-as-a-Service or other hosted MapReduce implementations? A year ago, InfoQ ran a story, Hadoop-as-a-Service from Amazon, Cloudera, Microsoft and IBM, which will serve as our baseline here.
Last week was Halloween and of course we went trick-or-treating with our three kids, which resulted in piles of sweets in the living room. Powered by the sugar, the kids would stay up late to count their harvest and while I was observing them at it, I was wondering if it is possible to explain the …
As nicely pointed out by Ilya Katsov: Denormalization can be defined as the copying of the same data into multiple documents or tables in order to simplify/optimize query processing or to fit the user’s data into a particular data model. So, I was wondering, why is – in Ilya’s write-up – denormalization not considered to be …
The value of large-scale datasets – stemming from IoT sensors, end-user and business transactions, social networks, search engine logs, etc. – apparently lies in the patterns buried deep inside them. Being able to identify and analyze these patterns is vital, be it for detecting fraud, determining a new customer segment or predicting a trend. As …
Imagine you search for a camera, say a Canon EOS 60D, and in addition to the usual search results you’re also offered a choice of actions you can perform on it, for example share the result with a friend, write a review for the item or, why not, directly buy it? Sounds far-fetched? …
Two widely used data formats on the Web are CSV and JSON. In order to enable fine-grained access in a hypermedia-oriented fashion I’ve started to work on Tride, a mapping language that takes one or more CSV files as inputs and produces a set of (connected) JSON documents. In the 2 min demo video I …
On the one hand you have structured data sources such as relational DBs, NoSQL datastores or OODBs and the like that allow you to query and manipulate data in a structured way. This typically involves schemata (either upfront with RDB or sort of dynamically with NoSQL) that define the data layout and the types of …
… because it’s simple, agnostic and an end-to-end solution. Wat? OK, let’s slow down a bit and go through the above keywords step by step. Simple Over 150 frameworks, libraries and tools directly support JSON in over 30 (!) languages. This might well be because the entire specification (incl. ToC, all the legal stuff and …