The European Spark Summit took place on October 25-27 in Brussels. Over 1,000 Spark enthusiasts gathered to attend training and listen to keynotes from Matei Zaharia, Ion Stoica, and Andy Steinbach.
This year, GoDataDriven was asked to deliver training and to give a keynote presentation. Needless to say, we were honored and seized the opportunity with both hands.
Spark Summit Training Day
On the first day of the Summit, the training day, Andrew Snare geared up to explore Wikipedia using Spark and teach the 100 participants in the room a thing or two. Luckily, Andrew was joined by three training assistants (TAs), including Kris Geusebroek. The TAs handled questions and remarks from the participants so that Andrew could focus on the training.
Kris Geusebroek remarked: "First of all, it was great to meet the people behind Databricks. The training went well. As a TA I did not have to sit still, but that made the effort rewarding in the end. Even more thrilling was the positive feedback from the participants, which, I must say, was a great accomplishment by Andrew as a trainer and by the rest of the TAs."
Spark Summit - The Conference
The first day of the two-day conference focused on developers, while the second day focused on the enterprise. This separation reflects the general theme that became apparent during these two days: Spark has become part of the core of Big Data and Data Science tooling, and the focus has now shifted from what we can do with it to how we can create value with it.
The two days featured awesome keynotes, including one that brought together beer (did you know that the Netherlands now has more breweries than Belgium?) and Max Verstappen in the same talk! Yes, this was the keynote delivered by our very own COO, Renald Buter.
Experiences at the Spark Summit
Quite a few consultants from GoDataDriven attended the Spark Summit. The general experience was a very positive one, with lots of information and fresh insights. For Bas Harenslak this was his first conference.
"The developer day was a great learning experience. I mostly attended sessions on testing, monitoring and debugging Spark, and learned about useful tricks and tools such as Vegas (Vega visualisation + Scala), SparkLint (a monitoring tool for Spark jobs) and Spark profiling with flame graphs", says Bas. "The second day was the enterprise day. Although I prefer the developer topics, it was still an interesting day with talks on structured streaming, containerised Spark and of course Renald’s keynote!"
A recurring topic in several talks was the availability of whole-stage codegen in Spark 2.0 for improving execution performance. It would have been good to have more presentations on Structured Streaming, since it was released recently with Spark 2.0. Besides the technical content, the conference was well organised, with nice food and drinks.
Jelte Hoekstra attended as well. "Many presentations were mostly focused on first use of Spark, for example migrating to Spark from Hive or a small data set-up. ETL is definitely a vital aspect of data science, but personally, I would say: more distributed machine learning! Perhaps at a future Summit they could try different formats in addition to just presentations. That would be nice!"
This blog has been published on the website of GoDataDriven.