I have stumbled across a troubling phenomenon today. I have found that while eating salsa and tortilla chips, my behavior inevitably falls into a specific pattern. A pattern that is not so great for my health.
So I pour myself some salsa and chips and begin to enjoy. Before too long I run out of chips. I say to myself, “Better get some more chips to go with the rest of this salsa.”
So I pour some more chips and the party in my mouth continues.
A few minutes later, I finally run out of salsa. “Uh oh,” I say. “Better get some more salsa to go with the rest of these chips.”
So I get some more salsa.
Then I run out of chips again…
See a pattern? I don’t know what to do about this yet, but the first step to fixing a problem is to recognize it. Therefore, I will henceforth refer to this conundrum as the Salsa-Chip Dilemma.
Interestingly, I have realized this pattern also takes place at a higher level: at the grocery store. Just today I bought a jar of salsa to go with the unfinished bag of tortilla chips that I knew I had at home. What will I do when I finish the chips before the salsa? I’ll buy more chips to go with the rest of the salsa I bought today.
If you’re interested in making yourself appear to be an intellectually superior badass while belittling your peers, here’s what you do:
- Become a coder.
- Learn obscure language hacks that minimize readability and maximize references to internal language constructs and syntactic quirks.
- Use said tricks to “improve” your team’s codebase, making sure to indicate their superiority over any of your coworkers’ ideas.
Here’s a freebie:
!!value
That’s the so-called “cast-to-bool” operator (which, mind you, is not actually an operator). It coerces any value to a boolean. Your inferior coworker probably wanted to use a straightforward and readable solution like `Boolean()`, or (heaven forbid) not even do any type conversion at all and instead just let the `if` statement perform the implicit type conversion. What can I say - this trade secret is not not the greatest thing since the Linux kernel.
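For the skeptics, here is a small sketch comparing the double negation (`!!value`) with the boring `Boolean()` call. The helper names and the sample values are mine, made up purely for illustration.

```javascript
// Two ways to coerce an arbitrary value to a boolean.
function castToBoolTheCleverWay(value) {
  return !!value; // negate twice: the first "!" coerces, the second flips it back
}

function castToBoolTheReadableWay(value) {
  return Boolean(value); // says what it does
}

// A mix of falsy and truthy sample values (arbitrary picks).
const samples = [0, 1, '', 'salsa', null, undefined, NaN, [], {}];

// Both helpers should agree on every sample.
const allAgree = samples.every(
  (value) => castToBoolTheCleverWay(value) === castToBoolTheReadableWay(value)
);
console.log(allAgree); // true
```

Both go through the same language-level boolean coercion, so the only thing the clever version buys you is fewer characters.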
I’ll give you one more:
+ new Date()
Believe it or not, we’re not doing any addition here. That is the more obscure and less readable alternative to `Date.now()`, which returns the current timestamp.
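A quick sketch to convince yourself the two really do the same job. The variable names are just for illustration, and the one-second tolerance is an arbitrary allowance for the time that passes between the two calls.

```javascript
// The unary plus coerces the Date object to a number (milliseconds since the epoch).
const viaUnaryPlus = +new Date();

// The readable alternative asks for the same number directly.
const viaDateNow = Date.now();

console.log(typeof viaUnaryPlus); // "number"
console.log(typeof viaDateNow);   // "number"

// The two calls happen a moment apart, so compare with a tolerance.
const closeEnough = Math.abs(viaDateNow - viaUnaryPlus) < 1000;
console.log(closeEnough);
```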
Now, I can’t take credit for either of these two freebies. The first was introduced to me by someone making a pull request to one of my old lame open source widgets or something. The second has over 3k upvotes on Stack Overflow.
One day not very long ago I was riding the bus. Most people on the buses here in Indianapolis keep to themselves, but occasionally I’ll find myself sitting next to a chatty one. This was one of those days.
“You know what I’m saying?” the man mumbled as he shook his head. I had no idea what conversation we were in or how we got there from the topic of reading books.
“I got a son, he works at Chase bank. And my daughter. She’s a paralegal. They make good money. They’re smart, you see.”
I nodded my head attentively. I was relishing this small part of the conversation that I could actually follow.
“I could make more money too. But higher paying jobs, they come with higher stress, you see. It’s a tradeoff. Each person gotta decide for themselves.”
“I myself, I’m just a janitor. I could be something else. I could make more money. But it would come with that higher stress, you know? I’d have to do more talking. I mean I can talk alright, like I’m talking to you right now. But it’s a different kinda talking. Not for me.”
Ethereum is a platform for creating decentralized web applications and network-enforced, code-based contracts. It’s young right now, but its most ardent advocates boldly proclaim that it will lead the way to a “web 3.0” and provide an alternative to formal legal systems for creating and enforcing organizations and contracts. What I intend to do here is provide some relevant information on what an Ethereum application (aka “Ðapp”) is, how it works, and most importantly: how to make one (assumes some experience with node.js and web development).
A brief overview of Ethereum
Before I go into the details of Ðapps, I want to include an introduction to Ethereum and contracts for those not familiar. At the heart of Ethereum is the concept of contracts. Contracts are a set of rules, described in computer code, which can be interacted with by users in the Ethereum network. Users can interact with contracts by invoking public methods defined on the contracts; the methods can receive data from the user and return data back to the user if desired. They can perform all sorts of manipulations on the data that they receive and can permanently store data in “storage” variables defined on the contract, which act in place of databases. In this sense, the Ethereum network can be described as a giant virtual machine, on which the contracts operate as virtual programs. (For those curious, this virtual machine is in fact Turing Complete.)
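To make the idea of public methods and storage a little more concrete, here is a toy model in plain JavaScript. To be clear, this is not Solidity and it never touches Ethereum: `ToyTallyContract`, its methods, and the account names are all hypothetical, and the "storage" is just an in-memory `Map` standing in for what the network would persist.

```javascript
// A plain-JavaScript stand-in for a contract: state plus public methods.
class ToyTallyContract {
  constructor() {
    // "Storage": data that sticks around between calls, in place of a database.
    this.storage = new Map();
  }

  // Public method that receives data from a user and updates storage.
  addPoints(sender, points) {
    if (!Number.isInteger(points) || points <= 0) {
      throw new Error('points must be a positive integer');
    }
    const updated = (this.storage.get(sender) || 0) + points;
    this.storage.set(sender, updated);
    return updated;
  }

  // Public method that reads from storage and returns data to the user.
  getPoints(account) {
    return this.storage.get(account) || 0;
  }
}

const contract = new ToyTallyContract();
contract.addPoints('0xalice', 5);
contract.addPoints('0xalice', 2);
console.log(contract.getPoints('0xalice')); // 7
console.log(contract.getPoints('0xbob'));   // 0
```

The difference on the real network is that every call runs on the shared virtual machine and the storage lives on the blockchain rather than in one process's memory.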
This virtual machine is enforced by a blockchain (a concept stolen unabashedly from Bitcoin), a public record of every change to the state of this machine. The blockchain is jointly maintained by all members of the Ethereum network (which anyone can join by running the program on their machine), making Ethereum decentralized. Thus there is no single point of failure for this network, making it robust against any attempts to compromise it.
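The "public record of every change" part can be sketched as a chain of hashed entries. This is a deliberately tiny illustration, assuming SHA-256 and a made-up block layout (`index`, `previousHash`, `stateChange`, `hash`); it leaves out everything that makes a real blockchain work across a network, such as consensus, mining, and signatures.

```javascript
const nodeCrypto = require('crypto');

const GENESIS_PREVIOUS_HASH = '0'.repeat(64);

// Hash a block's contents together with the hash of the block before it.
function hashBlock(index, previousHash, stateChange) {
  return nodeCrypto
    .createHash('sha256')
    .update(JSON.stringify([index, previousHash, stateChange]))
    .digest('hex');
}

// Return a new chain with one more recorded state change.
function appendBlock(chain, stateChange) {
  const index = chain.length;
  const previousHash = index === 0 ? GENESIS_PREVIOUS_HASH : chain[index - 1].hash;
  const hash = hashBlock(index, previousHash, stateChange);
  return chain.concat([{ index, previousHash, stateChange, hash }]);
}

// Recompute every hash and check that each block points at its predecessor.
function isChainValid(chain) {
  return chain.every((block, i) => {
    const expectedPrevious = i === 0 ? GENESIS_PREVIOUS_HASH : chain[i - 1].hash;
    return (
      block.index === i &&
      block.previousHash === expectedPrevious &&
      block.hash === hashBlock(block.index, block.previousHash, block.stateChange)
    );
  });
}

let chain = [];
chain = appendBlock(chain, { account: 'alice', delta: 5 });
chain = appendBlock(chain, { account: 'bob', delta: 3 });
console.log(isChainValid(chain)); // true

// Quietly rewriting an old entry no longer matches its recorded hash.
const tampered = chain.map((block) => ({ ...block }));
tampered[0].stateChange = { account: 'alice', delta: 500 };
console.log(isChainValid(tampered)); // false
```

Because each hash covers the previous one, changing any old entry means redoing every block after it, which is what every other member of the network would notice.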
Interacting with contracts
What this means to the general user is that they can download Mist or add Metamask to Chrome and then use the provided interfaces there to connect to their own Ethereum accounts. Once linked to at least one Ethereum account, the user can navigate to the URL of a hosted Ðapp and interact with a contract through a web interface.
In summary, a Ðapp is essentially a glorified web application which can tap into the Ethereum network. Ðapps differ from normal web applications in 2 significant ways:
- A Ðapp interacts with the Ethereum network rather than a server.
- A Ðapp must be browsed by an enhanced web browser, as standard browsers do not provide access to the Ethereum network.
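In practice the second difference shows up as a check at the top of a Ðapp's frontend code. The sketch below assumes that the enhanced browser exposes a global `web3` object, which is how Mist and Metamask behaved when I wrote this; the function names and messages are hypothetical.

```javascript
// Assumption: an Ethereum-enabled browser injects a global `web3` object.
function hasInjectedWeb3(globalObject) {
  return Boolean(globalObject) && typeof globalObject.web3 !== 'undefined';
}

// Turn the check into a message the page could show to the user.
function describeBrowser(globalObject) {
  return hasInjectedWeb3(globalObject)
    ? 'Ethereum-enabled browser detected'
    : 'No Ethereum access: try Mist or the Metamask extension';
}

// In a browser this inspects `window`; anywhere else it falls back to an empty object.
const globalObject = typeof window !== 'undefined' ? window : {};
console.log(describeBrowser(globalObject));
```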
A word on the significance and limitations of contracts
It’s worth noting at this point that contracts have an interesting property: they cannot be modified. This means that any mistakes made in the contract are permanent; the only way to fix them is to deploy a brand new contract with the fixes. This is certainly an inconvenience, but it also adds an important property to the contracts: it makes them binding. Because the code cannot be modified, the rules of the contract are strictly enforced by the Ethereum network itself. Ethereum proponents suggest that this can be used to create “autonomous organizations” which use Ethereum contracts to enforce rules and “smart contracts” which can be used to enforce payoffs for a bet.
All of this is certainly possible from a technical perspective. Making contracts useful in the real world, however, is more difficult. This is because contracts are just pieces of code, which can only access data within the Ethereum network. Ethereum on its own does not access data from the outside world, and certainly does not possess the capability to interpret that data in any meaningful way (such as verifying the result of a sports game and executing payoffs to the winners of a bet based on that result). Real world data can be fed into the network, but the contracts themselves need to handle all the complexity of validating and interpreting the data, or rely on an external mechanism to do so. This turns out to be a difficult problem, one which requires an entirely separate mechanism to be solved. I will not discuss this problem any further here as my knowledge of it is quite limited, but I did want to mention it.
Let’s make one!
Enough idle talk. Let’s build a Ðapp.
Note that familiarity with node.js and some web development experience is assumed here. I’m also assuming you’re developing on OSX or Linux.
Allow me to introduce some things which we’ll be using.
Truffle: Truffle is a development framework for building Ðapps which provides some nice functionality, such as management of contract deployment, abstractions for interacting with contracts, and (my personal favorite feature) testing support for contracts.
Solidity: Solidity is actually a programming language. It is a language designed specifically for Ethereum contracts, and as such compiles to byte code which can be deployed directly to the Ethereum network. Solidity is the language of choice for most people writing Ethereum contracts and is the language supported by Truffle.
testrpc: Testrpc is a node.js module that provides a fake Ethereum network which can be run locally. This is useless for production, but very handy for development. You can deploy your contracts to testrpc and interact with them as if they were deployed to Ethereum. It’s a nice sandbox.
Metamask: As mentioned previously, Metamask is just a Chrome extension which allows you to browse a Ðapp. I had no luck with Mist when developing locally (possibly an incompatibility with testrpc), but Metamask worked like a charm. I recommend using Metamask.
1. Install testrpc
The first thing you’ll want to do is install testrpc. Although I prefer to install npm packages locally, the simplest way to get started is to install it globally: `npm install -g ethereumjs-testrpc`. Note that testrpc requires node version 6.9.1 or greater. I personally use nvm to easily switch between node versions.
2. Install truffle
Again, the simplest way to get Truffle is to install it as a global npm package: `npm install -g truffle`.
3. Create your project directory
Create a folder for your project, `cd` into it, and then run 2 commands:
- `truffle init`
- `mkdir build && mkdir build/contracts` (creates missing directory needed for builds)
This will create a basic folder structure for your project and initialize it as a bare-bones demo application.
4. Try running the demo application
Try running Metacoin, the demo application. Run the following commands:
- `testrpc` (in a separate terminal) - Starts the fake Ethereum network. It will run on port 8545. Make sure you’re running node v6.9.1 or greater, or this won’t work.
- `truffle migrate` (in your project root directory) - Compiles and deploys your contract to testrpc (which must be running).
- `truffle serve` - Serves your web frontend.
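If you want to confirm that testrpc is actually listening before blaming Metamask, you can poke it directly over JSON-RPC. This is a rough sketch: `buildRpcRequest` and `callLocalNode` are hypothetical helpers of mine, the URL assumes the default port 8545 mentioned above, and the network call assumes a runtime with a global `fetch` (newer than the node version testrpc itself asks for). `eth_accounts` is a standard Ethereum JSON-RPC method that lists the node's accounts.

```javascript
// Build a JSON-RPC 2.0 request body.
function buildRpcRequest(method, params = [], id = 1) {
  return { jsonrpc: '2.0', method, params, id };
}

// Hypothetical helper: POST a request to the local node and return its result.
// Assumes a global fetch is available and that testrpc is running.
async function callLocalNode(method, params = [], url = 'http://localhost:8545') {
  const response = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildRpcRequest(method, params)),
  });
  const payload = await response.json();
  if (payload.error) {
    throw new Error(payload.error.message);
  }
  return payload.result;
}

// What would be sent to ask for the list of accounts:
console.log(JSON.stringify(buildRpcRequest('eth_accounts')));

// Usage (only works while testrpc is running):
// callLocalNode('eth_accounts').then((accounts) => console.log(accounts));
```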
Then set up Metamask:
- Install the Chrome extension.
- Unlock your vault in Metamask. You can do this by copying the mnemonic that should have been displayed in the terminal after booting testrpc. It will look something like this: `february range tired caught talk ripple shrimp interest across exist blur organ`. Click “Restore existing vault” in Metamask and paste the mnemonic. You can set the password to whatever you like.
- Switch to the network `localhost 8545` in Metamask (at the time of this writing, the current network is displayed at the top left of the main screen in Metamask).
If all goes well, you should be able to navigate to http://localhost:8080 in Chrome and interact with the trivial Metacoin app. Try sending some coin to an address. (You can look at the output from when you started testrpc to get a list of addresses that were generated.)
5. Make it your own
That was easy enough, right? Now try implementing your own idea. It can be stupid simple and trivial. Here are the steps you’ll probably want to take:
Write the contract in Solidity. The Solidity Browser is a nice tool for manually testing out your contracts as you write them. For formal automated tests, take a look at the tests in your demo project. Automated tests are wonderful and Truffle provides testing out of the box. I highly recommend setting aside time to build at least 1 or 2 tests for your contract.
If you would like to see another example, you are welcome to take a look at my first Ðapp project: https://github.com/tyleryasaka/EtherCred.
I went beyond the boilerplate example in that project, so it might be useful as an example of a more complex use of Truffle. Some of the things I did include:
- Using a more complex file structure
- Using a CSS framework
- Integrating Webpack for a customized build
- Creating an external script which seeds the test network with some sample data for development
Things to think about
- How does the blockchain really work? How robust is it? What would it take to bring it down?
- How can we bridge the gap between Ethereum and the real world in a way that is resistant to fraud?
- What is the future of the decentralized web? Will it take off? Could “smart contracts” eventually replace traditional legal contracts entirely?
Apparently, “data lakes” are all the rage these days among big data peoples. Here I take a brief look at what they are, when to use them, and things to consider about them. (Disclaimer: I have no experience using data lakes at the time of writing.)
What they are
(This is a rough overview. See the sources at the end of this post if you want to learn more about the specifics.)
Data lakes are places to store large quantities of data in its “natural” state. This is in contrast to typical data storage mechanisms (like databases) which require data to be input in a very specific format. In a SQL database, for example, each record in a table is required to have precisely the same number of columns as all the other records in that table. Even in a “nosql” database like MongoDB, where each document can have a completely different set of fields, the data is still required to be in a very strict JSON format.
Contrast this with data lakes, where data is merely dumped in as it is, with little or no attempt to convert it to any sort of format to be usable in the future. This is done to preserve the data in a raw format, so that absolutely no information is lost when it is saved. The idea is that the data can be converted to a more useful format at a later phase.
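Here is a toy sketch of that "dump now, interpret later" idea, keeping in mind that I have not used a real data lake. Everything in it is made up for illustration: the lake is just an array, and the `tempC` field, the `temp=19C` text format, and the content types are hypothetical examples of data arriving in different shapes.

```javascript
// A toy "lake": every payload is kept byte-for-byte, plus a little metadata.
function createLake() {
  return [];
}

// Dump a payload as-is. No validation, no conversion.
function dump(lake, rawPayload, contentType) {
  lake.push({
    receivedAt: new Date().toISOString(),
    contentType,
    raw: Buffer.from(rawPayload),
  });
  return lake.length - 1;
}

// Later phase: decide what we want (temperatures) and interpret each entry then.
function readTemperatures(lake) {
  const readings = [];
  for (const entry of lake) {
    const text = entry.raw.toString('utf8');
    if (entry.contentType === 'application/json') {
      try {
        const doc = JSON.parse(text);
        if (typeof doc.tempC === 'number') readings.push(doc.tempC);
      } catch (err) {
        // Not usable for this question; the raw bytes stay in the lake untouched.
      }
    } else if (entry.contentType === 'text/plain') {
      const match = /temp=(-?\d+(?:\.\d+)?)C/.exec(text);
      if (match) readings.push(Number(match[1]));
    }
    // Other formats (images, etc.) are simply skipped for this question.
  }
  return readings;
}

const lake = createLake();
dump(lake, '{"tempC": 21.5}', 'application/json');
dump(lake, 'sensor7 temp=19C', 'text/plain');
dump(lake, Buffer.from([0xff, 0xd8, 0xff]), 'image/jpeg');
console.log(readTemperatures(lake)); // [ 21.5, 19 ]
```

Notice that all the work of understanding the data lives in the reader, which is exactly the cost that gets postponed.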
When to use them
Data lakes are great for “big data” operations that generate large amounts of potentially interesting data. The data might be in various formats (JSON, plain text, jpg, etc.), and it would be impossible to know how the data might be used in the future, much less how to convert it to a useful format. By storing it in this “lake”, the task of data organization can be postponed to a later phase.
To be useful, the data must (eventually) become readable
Data is only useful when it is in a readable format. Storing data in a big “lake” is fine and dandy, but it is important to keep in mind that the data is not going to magically organize itself and spit out the answers we want to hear (unless that answer is “Forty-two”). This is work that will either have to be done by a human or by an automated process of some sort. The more disordered the data, the more work will be involved in understanding it (and the more intelligent an automated process will need to be).
Don’t forget security
The general idea of a data lake raises some security concerns. If you are indiscriminately dumping data by the terabytes into a single free-for-all repository, you want to make sure sensitive information like credit card numbers and passwords don’t get anywhere near it. If confidential data is accidentally dripped into the lake, I imagine it would be very difficult to “decontaminate”. The issue becomes even more complex when you take into account permission hierarchies. There might be a few superusers with complete access to all data; there might be some people with “dump” (write) access; and there might be some people with permissions to read only some of the data. If the data is supposed to be raw and untampered, how do we enforce these permissions? (Rhetorical question.)
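One way to keep the most obvious offenders out is to screen text before it ever reaches the lake. The sketch below is only a heuristic I put together for illustration: it looks for runs of 13 to 19 digits that pass the Luhn checksum used by payment card numbers. The function names are hypothetical, it will miss plenty (passwords, for a start), and it says nothing about the permissions problem.

```javascript
// Luhn checksum over a string of digits.
function passesLuhn(digits) {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let digit = digits.charCodeAt(i) - 48; // '0' is char code 48
    if (doubleNext) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Heuristic: any 13-19 digit run (spaces or dashes allowed) that passes Luhn.
function looksLikeCardNumber(text) {
  const candidates = text.match(/\b\d(?:[ -]?\d){12,18}\b/g) || [];
  return candidates.some((candidate) => passesLuhn(candidate.replace(/[ -]/g, '')));
}

// Gatekeeper: refuse to dump text that appears to contain a card number.
function dumpIfClean(lake, text) {
  if (looksLikeCardNumber(text)) {
    return false;
  }
  lake.push(text);
  return true;
}

const screenedLake = [];
console.log(dumpIfClean(screenedLake, 'bought chips and salsa, 2 jars'));   // true
console.log(dumpIfClean(screenedLake, 'card: 4111 1111 1111 1111, exp ?')); // false
console.log(screenedLake.length); // 1
```

Screening on the way in does change the "dump everything raw" premise a little, but it seems a lot cheaper than trying to decontaminate afterwards.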
- Investigate actual implementations of data lakes such as Hadoop
- Think about how data lakes can be combined with data analysis techniques such as unsupervised machine learning as a way to explore the data