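Data + AI Summit Europe 2020 (formerly Spark + AI Summit Europe) was held from November 17 to 19, 2020. Because of the COVID-19 pandemic, the conference was held online, like the one in June. It ran for three days: the first day was training, and the second and third days were the main conference. The conference covered technical content from practitioners who use Apache Spark™, Delta Lake, MLflow, Structured Streaming, BI and SQL analytics, and deep learning and machine learning frameworks to solve hard data problems. The full agenda is available at: https://databricks.com/dataaisummit/europe-2020/agenda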

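Unlike the June conference, the keynotes this time did not bring any big news, but the sessions on the second and third days still offer solid content worth a look. Over the next few days, this WeChat official account will also introduce some of the more interesting sessions, so stay tuned.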

The conference covered the following topics:

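AI use cases and new opportunities
Best practices and use cases for Apache Spark™, Delta Lake, MLflow, and more
Data engineering, including streaming architectures
SQL analytics and BI using data warehouses and data lakes
Data science, including the Python ecosystem
Machine learning and deep learning applications
Production machine learning (MLOps)
Research in large-scale data analytics and ML
Industry use cases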

How to download

Follow the WeChat official account 过往记忆大数据 or Java与大数据架构 and reply spark-9902 to get the slides.

Slides available for download

Slides are available for download for the sessions below, 129 in total. Note that all of the slides can also be viewed online at https://www.iteblog.com/archives/9902.html

3D: DBT using Databricks and Delta
Accelerated Training of Transformer Models
Achieving Lakehouse Models with Spark 3.0
Acoustics & AI for Conservation
Active Governance Across the Delta Lake with Alation
Add Historical Analysis of Operational Data with Easy Configurations in Fivetran Automated Data Integration
Advanced Natural Language Processing with Apache Spark NLP
Apache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
Apply MLOps at Scale
Arbitrary Stateful Aggregation and MERGE INTO
Bank Struggles Along the Way for the Holy Grail of Personalization: Customer 360
Building a Cross Cloud Data Protection Engine
Building a Distributed Collaborative Data Pipeline with Apache Spark
Building a MLOps Platform Around MLflow to Enable Model Productionalization in Just a Few Minutes
Building a Real-Time Supply Chain View: How Gousto Merges Incoming Streams of Inventory Data at Scale to Track Ingredients Throughout its Supply Chain
Building a SIMD Supported Vectorized Native Engine for Spark SQL
Building a Streaming Data Pipeline for Trains Delays Processing
Building a Streaming Microservices Architecture
Building an ML Tool to predict Article Quality Scores using Delta & MLFlow
Building Identity Graph at Scale for Programmatic Media Buying Using Apache Spark and Delta Lake
Building Notebook-based AI Pipelines with Elyra and Kubeflow
Building the Next-gen Digital Meter Platform for Fluvius
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
Cloud-native Semantic Layer on Data Lake
Common Strategies for Improving Performance on Your Delta Lakehouse
Comprehensive View on Date-time APIs of Apache Spark 3.0
Containerized Stream Engine to Build Modern Delta Lake
Context-aware Fast Food Recommendation with Ray on Apache Spark at Burger King
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS SageMaker for Enterprise AI Scenarios
Cost Efficiency Strategies for Managed Apache Spark Service
Data Engineers in Uncertain Times: A COVID-19 Case Study
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
Data Privacy with Apache Spark: Defensive and Offensive Approaches
Data Time Travel by Delta Time Machine
Data Time Travel by Delta Time Machine
Data Versioning and Reproducible ML with DVC and MLflow
Databricks University Alliance Meetup - Data + AI Summit EU 2020
Databricks Whitelabel: Making Petabyte Scale Data Consumable to All Our Customers
Delta: Building Merge on Read
Delta Lake: Optimizing Merge
Designing and Implementing a Real-time Data Lake with Dynamically Changing Schema
Detecting and Recognising Highly Arbitrary Shaped Texts from Product Images
Deterministic Machine Learning with MLflow and mlf-core
Developing ML-enabled Data Pipelines on Databricks using IDE & CI/CD at Runtastic
Digital Turbine Adopts A Lakehouse to Scale to Their Analytics Needs
Distributed and Scalable Model Lifecycle Capabilities
Diving into Delta Lake: Unpacking the Transaction Log
eBay’s Work on Dynamic Partition Pruning & Runtime Filter
Efficient Query Processing Using Machine Learning
Embedding Insight through Prediction Driven Logistics
End to End Supply Chain Control Tower
Extending Apache Spark – Beyond Spark Session Extensions
Foundations of Data Teams
Frequently Bought Together Recommendations Based on Embeddings
From Query Plan to Query Performance: Supercharging your Apache Spark Queries using the Spark UI SQL Tab
From Zero to Hero with Kafka Connect
Generalized Pipeline Parallelism for DNN Training
Getting Started with Apache Spark on Kubernetes
Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads
How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark
How The Weather Company Uses Apache Spark to Serve Weather Data Fast at Low Cost
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and Parquet Reader
Introducing MLflow for End-to-End Machine Learning on Databricks
Koalas: Interoperability Between Koalas and Apache Spark
Leveraging Apache Spark and Delta Lake for Efficient Data Encryption at Scale
Livestream Economy: The Application of Real-time Media and Algorithmic Personalisation in Urbanism
Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestration of Machine Learning Pipelines
MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams
Migrate and Modernize Hadoop-Based Security Policies for Databricks
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
ML Production Pipelines: A Classification Model
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed Feedback Environment
MLflow at Company Scale
MLOps Using MLflow
Model Experiments Tracking and Registration using MLflow on Databricks
Monitoring Half a Million ML Models, IoT Streaming Data, and Automated Quality Check on Delta Lake
Moving to Databricks & Delta
NLP Text Recommendation System Journey to Automated Training
Operating and Supporting Delta Lake in Production
Optimising Geospatial Queries with Dynamic File Pruning
Optimizing Apache Spark UDFs
Our Journey to Release a Patient-Centric AI App to Reduce Public Health Costs
Parallel Ablation Studies for Machine Learning with Maggy on Apache Spark
Personalization Journey: From Single Node to Cloud Streaming
Photon Technical Deep Dive: How to Think Vectorized
Polymorphic Table Functions: The Best Way to Integrate SQL and Apache Spark
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch and More!)
Productionizing Real-time Serving With MLflow
Project Zen: Improving Apache Spark for Python Users
Query or Not to Query? Using Apache Spark Metrics to Highlight Potentially Problematic Queries
Ray and Its Growing Ecosystem
Real-time Feature Engineering with Apache Spark Streaming and Hof
Real-Time Health Score Application using Apache Spark on Kubernetes
Reproducible AI Using PyTorch and MLflow
Reproducible AI Using PyTorch and MLflow
Revealing the Power of Legacy Machine Data
Scale and Optimize Data Engineering Pipelines with Software Engineering Best Practices: Modularity and Automated Testing
Scale-Out Using Spark in Serverless Herd Mode!
Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
Scaling Machine Learning with Apache Spark
Seamless MLOps with Seldon and MLflow
SHAP & Game Theory For Recommendation Systems
Simplifying AI integration on Apache Spark
Skew Mitigation For Facebook PetabyteScale Joins
Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metadata Platform
Spark NLP: State of the Art Natural Language Processing at Scale
Spark SQL Beyond Official Documentation
Spark SQL Join Improvement at Facebook
Speeding Time to Insight with a Modern ELT Approach
Stateful Streaming with Apache Spark: How to Update Decision Logic at Runtime
Stories from the Financial Service AI Trenches: Lessons Learned from Building AI Models in EY
Streaming Inference with Apache Beam and TFX
TeraCache: Efficient Caching Over Fast Storage Devices
The Beauty of (Big) Data Privacy Engineering
The Hidden Value of Hadoop Migration
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analytics Engineer
The Pill for Your Migration Hell
Transforming GE Healthcare with Data Platform Strategy
Trust, Context and, Regulation: Achieving More Explainable AI in Financial Services
Unlocking Geospatial Analytics Use Cases with CARTO and Databricks
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update/Delete SQL Operation
Using Machine Learning at Scale: A Gaming Industry Experience!
Using Machine Learning at Scale: A Gaming Industry Experience!
Using NLP to Explore Entity Relationships in COVID-19 Literature
Using Redash for SQL Analytics on Databricks
What is New with Apache Spark Performance Monitoring in Spark 3.0
X-RAIS: The Third Eye


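© Copyright belongs to the author. This is an original work by 51CTO blog author mob604756e9d3bc. If you repost it, please credit the source; otherwise legal action may be taken.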
