    Repositories list

    • pixels

      Public
      Facilitates simple, large-scale processing of HLS medical images, documents, and zip files. Previously at https://github.com/dmoore247/pixels
      JavaScript
      Other
      16000 Updated Dec 5, 2024
    • 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      27k000 Updated Dec 5, 2024
    • remorph

      Public
      Cross-compiler and Data Reconciler into Databricks Lakehouse
      Scala
      Other
      33000 Updated Dec 5, 2024
    • Public runnable examples of using John Snow Labs' NLP for Apache Spark.
      Jupyter Notebook
      Apache License 2.0
      603000 Updated Dec 5, 2024
    • State of the Art Natural Language Processing with John Snow Labs
      Scala
      Apache License 2.0
      714000 Updated Dec 5, 2024
    • This repository contains code examples used and shared through Databricks blog posts
      Python
      Other
      2000 Updated Dec 5, 2024
    • This repo provides a customizable stack for starting new ML projects on Databricks that follow production best-practices out of the box.
      Python
      Apache License 2.0
      165000 Updated Dec 5, 2024
    • composer

      Public
      Supercharge Your Model Training
      Python
      Apache License 2.0
      422000 Updated Dec 5, 2024
    • Gen AI application to estimate the cost of a payer treatment, service, or procedure
      Python
      Other
      1000 Updated Dec 5, 2024
    • mosaic

      Public
      An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.
      Jupyter Notebook
      Other
      69000 Updated Nov 25, 2024
    • dbdemos

      Public
      Demos to implement your Databricks Lakehouse
      HTML
      Other
      99000 Updated Nov 25, 2024
    • The Security Reference Architecture (SRA) implements typical security features as Terraform Templates that are deployed by most high-security organizations, and enforces controls for the largest risks that customers ask about most often.
      HCL
      Other
      44000 Updated Nov 25, 2024
    • DataOps for the Modern Data Warehouse on Microsoft Azure. https://aka.ms/mdw-dataops.
      Shell
      MIT License
      470000 Updated Nov 25, 2024
    • This repo provides learning materials and production-ready code to build a high-quality RAG application using Databricks.
      Python
      Other
      78100 Updated Nov 25, 2024
    • LLM training code for MosaicML foundation models
      Python
      Apache License 2.0
      532000 Updated Nov 25, 2024
    • Examples of Databricks Asset Bundles
      Python
      Other
      35000 Updated Nov 25, 2024
    • Generative AI data curation and model patterns that take advantage of publicly available BioMedical articles.
      Python
      Other
      3000 Updated Nov 25, 2024
    • tempo

      Public
      API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins, downsampling, and interpolation
      Jupyter Notebook
      Other
      53000 Updated Nov 21, 2024
    • LLM Bootcamp Series
      Python
      52000 Updated Nov 21, 2024
    • anomalib

      Public
      An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
      Python
      Apache License 2.0
      690000 Updated Nov 11, 2024
    • Security Analysis Tool (SAT) analyzes a customer's Databricks account and workspace security configurations and provides recommendations that help them follow Databricks' security best practices. When a customer runs SAT, it compares their workspace configurations against a set of security best practices and delivers a report.
      Python
      Other
      41000 Updated Nov 11, 2024
    • HLS RAG Chatbot Workshop
      Python
      1000 Updated Nov 11, 2024
    • PDF files ETL, parsing and vector search hosting
      Python
      Apache License 2.0
      2000 Updated Nov 11, 2024
    • Demonstrates how to use various generative AI forecasting models from within Databricks.
      Python
      Other
      6000 Updated Nov 4, 2024
    • hub

      Public
      A library for transfer learning by reusing parts of TensorFlow models.
      Python
      Apache License 2.0
      1.7k000 Updated Oct 24, 2024
    • smolder

      Public
      HL7 Apache Spark Datasource
      Scala
      Apache License 2.0
      21000 Updated Oct 23, 2024
    • lightseq

      Public
      Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
      Python
      9000 Updated Oct 20, 2024
    • Sample code for using various vector databases within Databricks
      Python
      Apache License 2.0
      1000 Updated Oct 10, 2024
    • sfdc-byom

      Public
      Modelling Databricks and Salesforce data to help your customers and improve your business outcomes
      Python
      Other
      1000 Updated Oct 8, 2024
    • Bootstrap your large scale forecasting solution on Databricks with Many Models Forecasting (MMF)
      Python
      Other
      18000 Updated Oct 8, 2024