Hortonworks seems to have done a good job of offering Hadoop tutorials, leveraging a Sandbox. In particular, they offer a real-time event stream with Apache Kafka, which is then persisted into HBase and Hive using Storm bolts. Food for thought.
In today’s markets, there are a number of drivers that influence the buy vs build decision for a pricing engine, irrespective of asset class. Those drivers probably include the following:
- Consistent pricing
- Minimising the trade-away percentage
- Ability to on-board clients in a timely manner
- High availability and low latency for both price construction and price delivery
This leads to a number of requirements which I suspect may be mandatory in certain organisations:
- Liquidity stream per user (not client) sourced from a liquidity pool – historical 3-5 tier pricing probably isn’t good enough these days
- High availability – consistent pricing from multiple data centres
You’ll always hear about the Definition of Done (DOD) from agile teams. However, what is not so often discussed is Definition of Ready (DOR) – sometimes referred to as DevReady in various organisations. Fabrique’s SXSW 2013 slide deck (53) offers a view of DOR. The Agile Alliance crystallises why DOR is important:
Avoids beginning work on features that do not have clearly defined completion criteria, which usually translates into costly back-and-forth discussion or rework
Churn during iteration due to imprecise requirements is often the by-product of not having a strict DOR.
Cloudera offer a number of interesting blog articles on Hadoop security – Impala using Apache Sentry. From an enterprise perspective, creation of a data lake will inevitably require a clean security approach from day 1.
Test-first methods can be further divided into Test-Driven Development (TDD) and Acceptance-Test Driven Development (ATDD). Both are supported by Test Automation, which is required in support of Continuous Integration, team velocity, and development efficiency.
Beauty of Testing (Steven) offers a few home truths which unfortunately are seemingly lost in the rush to make a production release by people not aware of the complexity of software engineering (and it not being an exact science):
WallStreet & Technology offers a view on Single Dealer Platforms (SDP). Over the last few years there have been a number of articles stating that SDPs will either take over the world, or are a dying breed. The upshot of this latest article is, unsurprisingly, a view that Single Dealer Platforms and Multi Dealer Platforms will be combined by either banks or clients to offer a subset of services based on the client’s needs – best of breed (effectively a type of mashup).
Great post from Slava @ReThinkDB. Some great lessons. 32 – news doesn’t age well! 21 – Leadership is important
Further, Slava offers some views on project management, notably that big projects should be treated with care. Of particular importance:
build the absolute most minimal version that solves the customer’s problem.
A few relevant postings to read:
- Domain-driven Design Example
- Rebuilding guardian.co.uk With DDD
- Domain Driven Design and Development In Practice
- An Introduction to Domain Driven Design
A colleague pointed me at a few interesting articles (Replicant) centred around state replication. For a long time I’ve been interested in state replication – I’ve blogged here and there about it over the years. The ability for each replica to accept change, and propagate that change to all other replicas while ensuring consistency, is an interesting problem to solve. Such solutions offer interesting scalability options in distributed applications.
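To make the problem concrete, here is a minimal sketch of replicas that each accept writes and then exchange state until they agree. It is not the approach taken in the Replicant articles: it swaps in a much simpler last-writer-wins merge, and the `Replica` class, the `sync` helper and the sample keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of a key-value state that accepts local writes (hypothetical)."""
    name: str
    clock: int = 0                             # simple logical clock
    data: dict = field(default_factory=dict)   # key -> (timestamp, replica name, value)

    def write(self, key, value):
        # Accept a change locally and stamp it so every replica can order it.
        self.clock += 1
        self.data[key] = (self.clock, self.name, value)

    def merge(self, other):
        # Pull the other replica's state; the larger (timestamp, name) stamp wins.
        for key, entry in other.data.items():
            if key not in self.data or entry[:2] > self.data[key][:2]:
                self.data[key] = entry
        self.clock = max(self.clock, other.clock)

    def view(self):
        # Strip the stamps and expose plain key -> value state.
        return {key: entry[2] for key, entry in self.data.items()}


def sync(replicas):
    # Propagate every replica's changes to every other replica.
    for target in replicas:
        for source in replicas:
            if source is not target:
                target.merge(source)


london, new_york = Replica("london"), Replica("new_york")
london.write("EURUSD", 1.3050)
new_york.write("EURUSD", 1.3055)   # concurrent write to the same key
new_york.write("GBPUSD", 1.6200)
sync([london, new_york])
print(london.view() == new_york.view())
```

Last-writer-wins silently discards one of two concurrent writes, which is exactly the sort of trade-off that makes consensus-based designs attractive when every change matters.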
Mencius offers more of what I’m interested in – single/multi leader
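As I understand the Mencius design, the multi-leader part comes from partitioning the log slots among the servers in round-robin fashion, with a server filling its slot with a no-op when it has nothing to propose. The toy below illustrates only that slot-ownership idea. It leaves out the consensus rounds, failure handling and message passing entirely, and `build_log` and its arguments are hypothetical names.

```python
NOOP = "no-op"


def build_log(num_servers, pending):
    """Interleave queued commands into one log using round-robin slot ownership.

    pending maps a server id to the commands queued at that server.
    """
    queues = {server: list(commands) for server, commands in pending.items()}
    log = []
    slot = 0
    while any(queues.values()):
        owner = slot % num_servers            # slot i belongs to server i mod n
        queue = queues.get(owner, [])
        # An idle owner skips its turn so the other servers are not held up.
        log.append((slot, owner, queue.pop(0) if queue else NOOP))
        slot += 1
    return log


for entry in build_log(3, {0: ["a", "b"], 2: ["c"]}):
    print(entry)
```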
Continuing with Vaughn’s book, page 22, table 1.4 offers a nice view of writing code that models the business – physical and conceptual domains. Page 25 hits the nail on the head with reference to software engineers not being able to pursue technologies and techniques just because they sound cool – well said Mr Vernon!
Page 33 touches on the issues of data-centric models, compared to business domain behaviours. All too often we see in code bases that the data payloads have become some warped domain model within the code base, leaking through every layer possible.
InfoQ offers an interesting read on agile planning and estimation. Here are a few key quotes:
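One way to stop a payload leaking through the layers is to translate it into a domain object once, at the boundary. The sketch below is not taken from the book; `QuotePayload`, `Quote` and `to_domain` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class QuotePayload:
    """Wire-level payload: shaped by the transport, carries no behaviour."""
    ccy_pair: str
    bid: str
    ask: str


@dataclass(frozen=True)
class Quote:
    """Domain object: speaks the business language and owns its rules."""
    pair: str
    bid: Decimal
    ask: Decimal

    def __post_init__(self):
        # The invariant lives with the concept, not in every caller.
        if self.bid > self.ask:
            raise ValueError("bid must not exceed ask")

    @property
    def spread(self) -> Decimal:
        return self.ask - self.bid

    def mid(self) -> Decimal:
        return (self.bid + self.ask) / 2


def to_domain(payload: QuotePayload) -> Quote:
    # Translate once, at the boundary, so inner layers never see the payload.
    return Quote(payload.ccy_pair, Decimal(payload.bid), Decimal(payload.ask))


quote = to_domain(QuotePayload("EURUSD", "1.3050", "1.3052"))
print(quote.spread, quote.mid())
```

The payload can then change shape with the transport without forcing changes on the code that reasons about quotes.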
All numbers are based on assumptions. Your figures are only as good as your estimates and guesses – and there may be serious flaws in your assumptions.
If we approve a project based on a year’s worth of work, and we estimate it and we think it is all good, there is a lot of risk inherent in there. But also, and this is worse, we might ditch doing something really, really valuable for the company because we have clumped together all this value into a year-long project and we are making a really big decision – to go or no go – based only on that clump.
“Behind the Curtain of the HealthCare.gov Rollout” offers a few excellent quotes. I particularly like the quote around the number of concurrent users:
After the launch, HHS officials sharply criticized CMS’s management leading up to the launch of Healthcare.gov. Referencing an email in which a CMS official admits the system could not handle more than 500 concurrent users, Mr. Baitman wrote “Frankly, it’s worse than I imagined!” and Mr. Sivak replied, “Anyone who has any software experience at all would read that and immediately ask what the fuck you were thinking by launching.”
Oracle GoldenGate and HotCache offer an interesting (but expensive) solution to certain classes of problem. Assume an application stack that is Java, leveraging Oracle Coherence, backed by an Oracle database, where the stack is installed in two locations and requires bi-directional replication of data, so that each application sees the other’s data and can perform actions on it. GoldenGate leveraging HotCache provides a mechanism to achieve this replication, with the added benefit that not only is the data replicated between databases (GoldenGate), but Coherence state is updated as well (HotCache).
- Parser combinators FTW
- Playing with Scala Parser Combinator
- External DSLs made easy with Scala Parser Combinators
- Parsing sentences using Scala parser combinator
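For anyone who has not met the idea, the core of parser combinators fits in a few lines: small parsers are plain functions, and larger parsers are built by combining them. The sketch below is written in Python rather than Scala and does not use Scala's library. The helper names (`regex`, `seq`, `many`, `fmap`) and the tiny addition/subtraction grammar are assumptions made for illustration.

```python
import re

# A parser is a function (text, pos) -> (value, new_pos), or None on failure.


def regex(pattern):
    compiled = re.compile(pattern)

    def parse(text, pos):
        match = compiled.match(text, pos)
        return (match.group(), match.end()) if match else None
    return parse


def seq(*parsers):
    # Run the parsers one after another; fail if any of them fails.
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def many(parser):
    # Apply the parser zero or more times (it must consume input to terminate).
    def parse(text, pos):
        values = []
        while (result := parser(text, pos)) is not None:
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def fmap(parser, func):
    # Transform the value produced by a successful parse.
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return func(value), new_pos
    return parse


def fold(parts):
    # parts is [first number, [[operator, number], ...]].
    total, rest = parts
    for op, value in rest:
        total = total + value if op == "+" else total - value
    return total


number = fmap(regex(r"\s*\d+"), int)
operator = fmap(regex(r"\s*[+-]"), str.strip)
expr = fmap(seq(number, many(seq(operator, number))), fold)


def evaluate(text):
    result = expr(text, 0)
    if result is None or result[1] != len(text.rstrip()):
        raise ValueError(f"cannot parse {text!r}")
    return result[0]


print(evaluate("1 + 2 - 3 + 10"))
```

The grammar reads almost like its own definition, which is much of the appeal for small external DSLs.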
Eric’s Domain-Driven Design (DDD) book is unfortunately not seen enough on software engineers’ desks. Hopefully, Implementing Domain-Driven Design will increase the usage of DDD. Having started to read the book on the last flight, I think Vaughn makes a number of comments in the early pages that are spot on. Specifically:
“General abandonment of good design and development practices in the Java community” (page xxvii)
“Scrum and other agile techniques are being used as a substitute for careful modelling, where a product backlog is thrust at developers as if it serves as a set of designs” (page xxvii)
BlackRock’s Aladdin is an interesting platform, offering dedicated client environments amongst other things. Anyone know what the codebase is written in? A couple of interesting reads on Aladdin:
- you must not change the tag numbers of any existing fields.
- you must not add or delete any required fields.
- you may delete optional or repeated fields.
- you may add new optional or repeated fields but you must use fresh tag numbers (i.e. tag numbers that were never used in this protocol buffer, not even by deleted fields).
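The four rules above are mechanical enough to check in code. The sketch below does not use the protobuf library: it assumes a schema is described as a plain dict of field name to (tag number, label), and that fields are matched by name between versions. `check_update` and `retired_tags` are hypothetical names.

```python
def check_update(old, new, retired_tags=frozenset()):
    """Return a list of rule violations when evolving `old` into `new`.

    Schemas are dicts of field name -> (tag number, label). retired_tags holds
    tag numbers used by fields that were deleted in earlier versions.
    """
    problems = []
    old_tags = {tag for tag, _ in old.values()}

    for name, (tag, label) in old.items():
        if name in new:
            # Existing fields must keep their tag numbers.
            if new[name][0] != tag:
                problems.append(f"{name}: tag changed from {tag} to {new[name][0]}")
        elif label == "required":
            # Optional and repeated fields may be deleted; required ones may not.
            problems.append(f"{name}: required field deleted")

    for name, (tag, label) in new.items():
        if name in old:
            continue
        if label == "required":
            problems.append(f"{name}: required field added")
        # New fields need tags never used before, not even by deleted fields.
        if tag in old_tags or tag in retired_tags:
            problems.append(f"{name}: tag {tag} was used before")

    return problems


old = {"id": (1, "required"), "email": (2, "optional"), "nickname": (3, "optional")}
ok = {"id": (1, "required"), "email": (2, "optional"), "phones": (4, "repeated")}
bad = {"id": (5, "required"), "email": (2, "optional"),
       "phones": (3, "repeated"), "token": (6, "required")}

print(check_update(old, ok))
for problem in check_update(old, bad):
    print(problem)
```

Because matching is by name here, a renamed field shows up as a deletion plus an addition that reuses a tag. That is stricter than the rules require, and is a limitation of this simplified representation.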
Market share is however the influencer. Could Google copy the concept, and with its market share?
Some of us started on these, learning CECIL.