Cohesion via Communal Data Platforms: A Manifesto
August 21, 2021
Draft 1: Aug 21, 2021
- Our most precious resource as a community is our ability to make better decisions together (“cohesion”)
- Better data (fresher, more accurate, more digestible) enables better decisions
- Data Platforms are the ecosystems of people, pipelines, and policies designed to provide us better data
- Our cohesion is limited by how fast our shared Platform can adapt to new requirements
- This implies the core Platform must be built around open source and open standards, so we control our own evolution
- Vendors may be used for well-defined integrations and innovative interfaces, where they can’t constrain the overall architecture
Psycho-Analytic Engineering (Coalesce 2021)
June 6, 2021
Using Data to Differentiate Our Selves
Keynote Talk Proposal for Coalesce 2021
Based on “DBT as Organizational Therapy”
DBT as the “Couch” for Organizational Therapy
May 13, 2021
Or, “How ELTT is the Key to World Peace”
Draft Submission Script for Coalesce 2021
Configuring Databricks on AWS
May 5, 2021
Despite the excellent QuickStart tools, this was much harder than I expected. For some reason, creating a Databricks Workspace on AWS gave me the most trouble.
Here are some tips that might help others who get stuck.
SyncHouse: MVC for Enterprise SaaS
May 2, 2021
A concrete proposal for “Imagining a Data Resort”: enforce a Model-View-Controller architecture across multiple Software-as-a-Service applications. The key is replacing transient enterprise data integrations with a persistent “sync house” and making it the one full-service Source of Truth for data, schemas, and business logic.
- Ingest data from Salesforce, NetSuite, etc. (e.g., Stitch/Talend, Fivetran)
- Store raw data in a LakeHouse (e.g., Databricks, Delta Lake; or just Redshift)
- Aka “ELT vs ETL”
- Manage schemas via dbt (e.g., dbt Cloud)
- View and report on appropriate data (e.g., Mode, Data Studio)
- Push updates (reverse ETL) back to source applications (e.g., Celigo, Get Census)
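The stages in the list above can be sketched as a toy, in-memory flow. This is only an illustration of the shape of the idea, not a real integration: the `SyncHouse` class, its method names, and the sample record are all hypothetical, and a real stack would delegate each step to the tools named in the list.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict  # one record; a real system would use typed schemas


@dataclass
class SyncHouse:
    """Hypothetical in-memory stand-in for the persistent store."""
    raw: dict = field(default_factory=dict)     # data exactly as ingested
    models: dict = field(default_factory=dict)  # transformed, modeled tables

    def ingest(self, source: str, rows: list) -> None:
        """Load rows untouched: extract and load first, transform later (ELT)."""
        self.raw.setdefault(source, []).extend(rows)

    def build_model(self, name: str, source: str,
                    transform: Callable[[Row], Row]) -> None:
        """Derive a modeled table from raw data (the step dbt would manage)."""
        self.models[name] = [transform(row) for row in self.raw.get(source, [])]

    def view(self, name: str) -> list:
        """Return a read-only copy for reporting tools."""
        return [dict(row) for row in self.models[name]]

    def updates_for(self, name: str, target: str) -> list:
        """Build the payloads a reverse-ETL tool would push to a source app."""
        return [{"target": target, "record": dict(row)}
                for row in self.models[name]]


house = SyncHouse()
house.ingest("salesforce", [{"Id": "001", "Name": " Acme "}])
house.build_model(
    "accounts", "salesforce",
    lambda r: {"account_id": r["Id"], "name": r["Name"].strip()},
)
report = house.view("accounts")
updates = house.updates_for("accounts", "netsuite")
```

Note that the raw rows are never modified: every model is derived from them, which is what lets the store act as the single Source of Truth.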
Imagining a Data Resort
April 21, 2021
A data resort is where data comes to get pampered, so that it is prepared to get back to work.
Motivation
The good news is that I finally understand how we really need to be managing all the business data in my organization. The bad news is that I don’t know how to articulate that in industry-standard terminology (examples below). Worse, I’d probably use the wrong term (or the right term incorrectly), leading to endless rounds of frustrating conversation.
Therefore I’ve coined a new term (“data resort”) which I can define as the exact thing I want. Hopefully you, my dear readers, can help translate that into something concrete I can efficiently buy+build today!
Let me know your suggestions in the comments or via email.
Become Like a Billionaire
December 10, 2020
- Obsess over a Wildly Important Problem that has not been properly characterized
- Identify a novel point of technological leverage for solving that problem
- Discover a market hurting enough to pay for even a crappy solution to that problem
- Iterate and improve on all the above until you die, fully solve the problem, or hand it over to someone who can do better.
My First Date with Quilt Data
July 21, 2020
I’ve known the good folks at Quilt Data for a long time. A company hackathon gave me a good excuse to finally use their product “in anger” for an actual demo. These are my notes on how to configure quilt3 and create my first package (and pandas DataFrame) from a CSV.
SSO Login into Salesforce from Node via samlp SAML IdP
October 4, 2019
Documenting this in a blog post because it drove us crazy trying to figure out exactly what was involved, even though it was actually easy to implement once we understood all the terminology.
In order for our previously-authenticated users to automatically log into Salesforce, we needed to:
- Create a “/sso-url” on our node server for our web app to access
- When our web app GETs that URL, create and return a SAML Identity Provider (IdP) using samlp
- That IdP is interpreted by the web browser as a redirect to the Salesforce URL (returned by the function assigned to `getPostURL`)
- Salesforce just needs to have the IdP certificate and Entity ID in its SSO Settings
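To show why the browser ends up at Salesforce, here is a minimal sketch of the SAML HTTP POST binding, in which the response travels as an auto-submitting HTML form. It is written in Python purely for illustration and is not the samlp API. The function name, the URL, and the placeholder XML are assumptions, and a real response must be signed with the IdP's private key, which this sketch omits.

```python
import base64
import html


def build_post_binding_page(post_url: str, saml_response_xml: str) -> str:
    """Return an HTML page that auto-POSTs a SAML response to post_url.

    post_url plays the role of the value returned by getPostURL.
    The XML is NOT signed here; a real IdP must sign it.
    """
    # The POST binding carries the response XML base64-encoded in a form field.
    encoded = base64.b64encode(saml_response_xml.encode("utf-8")).decode("ascii")
    action = html.escape(post_url, quote=True)
    return (
        '<html><body onload="document.forms[0].submit()">'
        f'<form method="post" action="{action}">'
        f'<input type="hidden" name="SAMLResponse" value="{encoded}">'
        '</form></body></html>'
    )


# Hypothetical values for illustration only.
placeholder_xml = "<samlp:Response>placeholder</samlp:Response>"
page = build_post_binding_page("https://example.my.salesforce.com/", placeholder_xml)
```

Because the form submits itself as soon as the page loads, the user experiences it as a redirect, even though the browser is actually making a POST to the Salesforce URL.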
Below are additional details on why we needed this.
Book Review: Quantum Philosophy and the End of Education
March 11, 2019
Quantum Philosophy and the End of Education, by Roo Pavan (self-published)
April 1st, 2019
This self-published book by a retired physicist turned tech millionaire has taken the education establishment by storm — and not in a good way. Few people had even heard of this book or its author, Roo Pavan, until President Trump mentioned it approvingly in a tweet. It is doubtful whether our Esteemed Leader actually read the book, but that didn’t stop him from claiming he would use it as the blueprint for education policy in his second term. Like most of the book’s critics, he probably only read the sensationalist claims in the final chapter rather than the surprisingly thoughtful analysis that preceded it.
Which is a shame, because that would have been a conversation worth having. The author’s main thesis is contrarian but hardly new: that Western philosophy in general — and higher education in particular — are more about perpetuating a cultural elite than actually pursuing truth and serving society, though he concedes that those have often been a useful byproduct.