
How AI Is Built

Nicolay Gerold

Available episodes

Showing 5 of 51 results
  • #047 Architecting Information for Search, Humans, and Artificial Intelligence
    Today on How AI Is Built, Nicolay Gerold sits down with Jorge Arango, an expert in information architecture. Jorge emphasizes that aligning systems with users' mental models is more important than optimizing backend logic alone, and shares a practical framework for doing so.
    Key Points:
    • Information architecture should bridge user mental models with system data models
    • Information's purpose is to help people make better choices and act more skillfully
    • Well-designed systems create learnable (not just "intuitive") interfaces
    • Context and domain boundaries significantly impact user understanding
    • Progressive disclosure helps accommodate users with varying expertise levels
    Information Architecture Fundamentals
    What Is Information?
    • Information helps people make better choices so they can act more skillfully
    • Example: "No dog pooping" signs help people predict the consequences of their actions
    • Poor information systems fail to provide relevant guidance for users' needs
    Mental Models vs. Data Models
    • Systems have underlying conceptual structures that should reflect user mental models
    • Data models make these conceptual models "normative" in the infrastructure
    • Designers serve as translators between user needs and technical implementation
    • Goal: users should think "the person who designed this really gets me"
    Design Strategies for Complex Systems
    Progressive Disclosure
    • Present simple interfaces by default, with clear paths to advanced functionality
    • Example: HyperCard - a visual interface for beginners with a programming layer for experts
    • Lets both novice and expert users use the same system effectively
    Context Setting and Domain Boundaries
    • All interactions happen within a context that influences understanding
    • Words acquire different meanings in different contexts (e.g., "save" in computing vs. banking)
    • Clearer domain boundaries make information architecture design easier
    • The hardest systems to design are those serving many purposes for diverse audiences
    Conceptual Modeling (an Underrated Practice)
    • Should precede UI sketching, but is often skipped by designers
    • Defines the concepts needed in the system and their relationships
    • Creates more cohesive and coherent systems, especially for complex projects
    • More valuable than sitemaps, which imply rigid hierarchies
    LLMs and Information Architecture
    Current and Future Applications
    • Transforming search experiences (e.g., Perplexity providing answers instead of link lists)
    • Improving intent parsing in traditional search
    • Helping information architects with content analysis and navigation structure design
    • Enabling faster, better analysis of large content repositories
    Implementation Advice
    For Engineers and Designers
    • Designers should understand how systems are built (the materials of construction)
    • Engineers benefit from understanding user perspectives and mental models
    • Both disciplines have much to teach each other
    For Complex Applications
    • Map conceptual models before writing code
    • Test naming with real users
    • Implement progressive disclosure with good defaults
    • Remember: "If the user can't find it, it doesn't exist"
    Notable Quotes:
    • "People only understand things relative to things they already understand." - Richard Saul Wurman
    • "The hardest systems to design are the ones that are meant to do a lot of things for a lot of different people." - Jorge Arango
    • "Very few things are intuitive. There's a long-running joke in the industry that the only intuitive interface for humans is the nipple. Everything else is learned." - Jorge Arango
    Chapters:
    00:00 Introduction to Backend Systems
    00:36 Guest Introduction: Jorge Arango
    01:12 Podcast Dynamics and Guest Experiences
    01:53 Timeless Principles in Technology
    02:08 Interesting Conversations and Learnings
    04:04 Physical vs. Digital Organization
    04:21 Smart Defaults and System Maintenance
    07:20 Data Models and Conceptual Structures
    08:53 Designing User-Centric Systems
    10:20 Challenges in Information Systems
    10:35 Understanding Information and Choices
    15:49 Clarity and Context in Design
    26:36 Progressive Disclosure and User Research
    37:05 The Role of Large Language Models
    54:59 Future Directions and New Series (MLOps)
    Jorge Arango: LinkedIn | Website | X (Twitter)
    Nicolay Gerold: LinkedIn | X (Twitter)
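    The "progressive disclosure with good defaults" advice maps directly onto API design. A minimal sketch, with all names hypothetical (not from the episode): the common path needs zero configuration, while the advanced knobs stay reachable for expert users.

    ```python
    def search(query, *, limit=10, ranker="default", filters=None, explain=False):
        """Novices call search("jazz"); experts opt into the advanced parameters.

        The keyword-only arguments are the "disclosed" layer: they exist, but
        the default call site never has to mention them.
        """
        return {
            "query": query,
            "limit": limit,
            "ranker": ranker,
            "filters": filters or {},  # avoid a mutable default argument
            "explain": explain,
        }

    # Simple path: search("jazz")
    # Expert path: search("jazz", ranker="bm25", filters={"year": 2024}, explain=True)
    ```

    The same idea applies to UIs: one learnable surface, two audiences.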
    --------  
    57:22
  • #046 Building a Search Database From First Principles
    Modern search is broken. There are too many pieces glued together:
    • Vector databases for semantic search
    • Text engines for keywords
    • Rerankers to fix the results
    • LLMs to understand queries
    • Metadata filters for precision
    Each piece works well alone. Together, they often become a mess. When you glue these systems together, you create:
    • Data Consistency Gaps: your vector store knows about documents your text engine doesn't. Which is right?
    • Timing Mismatches: new content appears in one system before another, so users see different results depending on which path their query takes.
    • Complexity Explosion: every new component adds more integration points. Three components mean three connections; five mean ten.
    • Performance Bottlenecks: each hop between systems adds latency. A 200ms search becomes 800ms after passing through four components.
    • Brittle Chains: when one system fails, your entire search breaks. More pieces mean more breaking points.
    I recently built a system with query-specific post-filters but a requirement to deliver a fixed number of results to the user. Often the query had to be run multiple times to reach the desired count. The consequences: unpredictable latency; heavy backend load, with some queries hammering the database 10+ times; and a relevance cliff, where results 1-6 looked great but the later ones were poor matches.
    Today on How AI Is Built, we are talking to Marek Galovic from TopK about how they built a new search database with modern components, asking: "How would search work if we built it today?" Cloud storage is cheap. Compute is fast. Memory is plentiful. The answer is one system that handles vectors, text, and filters together - not three systems duct-taped into one. One pass handles everything: vector search + text search + filters → a single sorted result. Built with hand-optimized Rust kernels for both x86 and ARM, the system scales to 100M documents with 200ms P99 latency. The goal is to do search in 5 lines of code.
    Marek Galovic: LinkedIn | Website
    TopK Website | TopK Docs
    Nicolay Gerold: LinkedIn | X (Twitter)
    Chapters:
    00:00 Introduction to TopK and Snowflake Comparison
    00:35 Architectural Patterns and Custom Formats
    01:30 Query Execution Engine Explained
    02:56 Distributed Systems and Rust
    04:12 Query Execution Process
    06:56 Custom File Formats for Search
    11:45 Handling Distributed Queries
    16:28 Consistency Models and Use Cases
    26:47 Exploring Database Versioning and Snapshots
    27:27 Performance Benchmarks: Rust vs. C/C++
    29:02 Scaling and Latency in Large Datasets
    29:39 GPU Acceleration and Use Cases
    31:04 Optimizing Search Relevance and Hybrid Search
    34:39 Advanced Search Features and Custom Scoring
    38:43 Future Directions and Research in AI
    47:11 Takeaways for Building AI Applications
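    The post-filter problem described above can be sketched in a few lines. This is a hypothetical reconstruction of the pattern, not anyone's actual code: `search_backend` and `passes_filter` are stand-ins for the real ranked query and the query-specific post-filter.

    ```python
    def search_backend(query, limit, offset=0):
        # Stand-in for the real backend: returns `limit` candidates ranked
        # by relevance, starting at `offset`.
        return [{"id": i, "score": 1.0 / (i + 1)} for i in range(offset, offset + limit)]

    def passes_filter(doc):
        # Stand-in post-filter; in the real system this depends on the query.
        return doc["id"] % 3 != 0

    def fetch_fixed_count(query, wanted=10, batch=10, max_rounds=10):
        """Re-run the query with growing offsets until `wanted` results
        survive the post-filter.

        Each extra round is another backend hit, which is exactly where the
        unpredictable latency and the 10+ database queries come from.
        """
        results, offset, rounds = [], 0, 0
        while len(results) < wanted and rounds < max_rounds:
            candidates = search_backend(query, batch, offset)
            results.extend(d for d in candidates if passes_filter(d))
            offset += batch
            rounds += 1
        return results[:wanted], rounds
    ```

    A single-pass engine that evaluates the filter inside the query avoids the loop entirely, which is the design TopK argues for.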
    --------  
    53:29
  • #045 RAG As Two Things - Prompt Engineering and Search
    John Berryman moved from aerospace engineering to search, then to ML and LLMs. His path: Eventbrite search → GitHub code search → data science → GitHub Copilot. Throughout his career he was drawn toward more math and ML.
    RAG Explained
    "RAG is not a thing. RAG is two things." It breaks into:
    • Search - finding relevant information
    • Prompt engineering - presenting that information to the model
    These should be treated as separate problems to optimize.
    The Little Red Riding Hood Principle
    When prompting LLMs, stay on the path of what models have seen in training. Use formats, structures, and patterns they recognize from their training data:
    • For code, use docstrings and proper formatting
    • For financial data, use SEC report structures
    • Use Markdown for better formatting
    Models respond better to familiar structures.
    Testing Prompts
    • Start with "vibe testing" - human evaluation of outputs
    • Develop systematic tests based on observed failure patterns
    • Use token probabilities to measure model confidence
    • For few-shot prompts, watch for diminishing returns as examples increase
    Managing Token Limits
    When designing prompts, divide content into:
    • Static elements (boilerplate, instructions)
    • Dynamic elements (user inputs, context)
    Prioritize content by:
    • Must-have information
    • Nice-to-have information
    • Optional if space allows
    Even with larger context windows, efficiency remains important for cost and latency.
    Completion vs. Chat Models
    Chat models are winning despite initial concerns about their constraints:
    • Completion models allow more flexibility in document format
    • Chat models are more reliable and aligned with common use cases
    • Most applications now use chat models, even for completion-like tasks
    Applications: Workflows vs. Assistants
    Two main LLM application patterns:
    • Assistants: human-in-the-loop interactions where users guide and correct
    • Workflows: decomposed tasks where LLMs handle well-defined steps with safeguards
    Breaking Down Complex Problems
    Two approaches:
    • Horizontal: split into sequential steps with clear inputs/outputs
    • Vertical: divide by case type, with specialized handling for each scenario
    Example: for SOX compliance, break horizontally (understand control, find evidence, extract data, compile report) and vertically (different audit types).
    On Agents
    Agents exist on a spectrum from assistants to workflows, characterized by:
    • Having some autonomy to make decisions
    • Using tools to interact with the environment
    • Usually requiring human oversight
    Best Practices
    For building with LLMs:
    • Start simple: an API key and a Jupyter notebook
    • Build prototypes and iterate quickly
    • Add evaluation as you scale
    • Keep users in the loop until models prove reliable
    John Berryman: LinkedIn | X (Twitter) | Arcturus Labs | Prompt Engineering for LLMs (Book)
    Nicolay Gerold: LinkedIn | X (Twitter)
    Chapters:
    00:00 Introduction to RAG: Retrieval and Generation
    00:19 Optimizing Retrieval Systems
    01:11 Introducing John Berryman
    02:31 John's Journey from Search to Prompt Engineering
    04:05 Understanding RAG: Search and Prompt Engineering
    05:39 The Little Red Riding Hood Principle in Prompt Engineering
    14:14 Balancing Static and Dynamic Elements in Prompts
    25:52 Assistants vs. Workflows: Choosing the Right Approach
    30:15 Defining Agency in AI
    30:35 Spectrum of Assistance and Workflows
    34:35 Breaking Down Problems Horizontally and Vertically
    37:57 SOX Compliance Case Study
    40:56 Integrating LLMs into Existing Applications
    44:37 Favorite Tools and Missing Features
    46:37 Exploring Niche Technologies in AI
    52:52 Key Takeaways and Future Directions
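    The token-limit strategy above - static boilerplate first, then dynamic content packed by priority until the budget is spent - can be sketched as follows. This is an illustrative sketch, not John Berryman's implementation; `count_tokens` is a naive whitespace counter standing in for a real tokenizer.

    ```python
    def count_tokens(text):
        # Crude stand-in for a real tokenizer: one token per whitespace word.
        return len(text.split())

    def build_prompt(static_parts, dynamic_parts, budget):
        """Assemble a prompt under a token budget.

        static_parts: always-included boilerplate and instructions.
        dynamic_parts: list of (priority, text); lower number = must-have.
        Greedily includes dynamic parts in priority order, skipping any
        that would overflow the budget.
        """
        prompt = list(static_parts)
        used = sum(count_tokens(p) for p in prompt)
        for _, text in sorted(dynamic_parts, key=lambda p: p[0]):
            cost = count_tokens(text)
            if used + cost <= budget:
                prompt.append(text)
                used += cost
        return "\n".join(prompt)
    ```

    The same shape works with a real tokenizer and a real context window; only the counting function changes.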
    --------  
    1:02:44
  • #044 Graphs Aren't Just For Specialists Anymore
    Kuzu is an embedded graph database that implements Cypher as a library. It can be easily integrated into various environments - from scripts and Android apps to serverless platforms. Its design supports both ephemeral, in-memory graphs (ideal for temporary computations) and large-scale persistent graphs where traditional systems struggle with performance and scalability.
    Key Architectural Decisions:
    • Columnar Storage: Kuzu stores node and relationship properties in separate, contiguous columns. This design reduces I/O by allowing queries to scan only the needed columns, unlike row-based systems (e.g., Neo4j) that read full records even when only a subset of properties is required.
    • Efficient Join Indexing with CSR: the join index is maintained in a Compressed Sparse Row (CSR) format. By sorting and compressing relationship data, Kuzu ensures that a node's adjacent relationships are stored contiguously, minimizing random I/O and speeding up traversals.
    • Vectorized Query Processing: instead of processing one tuple at a time, Kuzu processes blocks (vectors) of tuples. This block-based approach reduces function-call overhead and improves cache locality, boosting performance for analytic queries.
    • Factorization and ASP Join: for many-to-many queries that can generate enormous intermediate results, Kuzu uses factorization to represent data compactly. Its ASP join algorithm integrates factorization, sequential scanning, and sideways information passing to avoid unnecessary full scans and materializations.
    Kuzu is optimized for read-heavy, analytic workloads. While batched writes are efficient, the system is less tuned for high-frequency, small transactions.
    Upcoming features include:
    • A WebAssembly (Wasm) version for running in browsers
    • Enhanced vector and full-text search indices
    • Built-in graph data science algorithms for tasks like PageRank and centrality analysis
    Kuzu can be a powerful backend for AI applications in several ways:
    • Knowledge Graphs: store and query complex relationships between entities to support natural language understanding, semantic search, and reasoning tasks.
    • Graph Data Science: run built-in graph algorithms (like PageRank, centrality, or community detection) to uncover patterns and insights for recommendation systems, fraud detection, and other AI-driven analyses.
    • Retrieval-Augmented Generation (RAG): integrate with large language models by efficiently retrieving relevant, structured graph data. Kuzu's vector search capabilities and fast query processing make it well suited to augmenting AI responses with contextual information.
    • Graph Embeddings & ML Pipelines: serve as the foundation for generating graph embeddings used in downstream machine learning tasks - such as clustering, classification, or link prediction - to enhance model performance.
    Semih Salihoğlu: LinkedIn | Kuzu GitHub | Kuzu Docs
    Nicolay Gerold: LinkedIn | X (Twitter)
    Chapters:
    00:00 Introduction to Graph Databases
    00:18 Introducing Kuzu: A Modern Graph Database
    01:48 Use Cases and Applications of Kuzu
    03:03 Kuzu's Research Origins and Scalability
    06:18 Columnar Storage vs. Row-Oriented Storage
    10:27 Query Processing Techniques in Kuzu
    22:22 Compressed Sparse Row (CSR) Storage
    27:25 Vectorization in Graph Databases
    31:24 Optimizing Query Processors with Vectorization
    33:25 Common Wisdom in Graph Databases
    35:13 Introducing ASP Join in Kuzu
    35:55 Factorization and Efficient Query Processing
    39:49 Challenges and Solutions in Graph Databases
    45:26 Write Path Optimization in Kuzu
    54:10 Future Developments in Kuzu
    57:51 Key Takeaways and Final Thoughts
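    The CSR layout behind Kuzu's join index can be illustrated with a toy sketch (plain Python, not Kuzu's actual implementation): all neighbours of node `i` sit contiguously in `targets[offsets[i]:offsets[i+1]]`, so a traversal becomes one sequential scan instead of scattered random reads.

    ```python
    def build_csr(num_nodes, edges):
        """Build a Compressed Sparse Row adjacency from (src, dst) pairs.

        Returns (offsets, targets): `offsets` has num_nodes + 1 entries, and
        node i's neighbours are targets[offsets[i]:offsets[i+1]].
        """
        counts = [0] * num_nodes
        for src, _ in edges:
            counts[src] += 1
        # Prefix-sum the per-node counts into row offsets.
        offsets = [0] * (num_nodes + 1)
        for i, c in enumerate(counts):
            offsets[i + 1] = offsets[i] + c
        # Sort edges by source so each node's neighbours land contiguously.
        targets = [0] * len(edges)
        cursor = list(offsets[:-1])
        for src, dst in sorted(edges):
            targets[cursor[src]] = dst
            cursor[src] += 1
        return offsets, targets

    def neighbours(offsets, targets, node):
        # One contiguous slice -- no pointer chasing, no random I/O.
        return targets[offsets[node]:offsets[node + 1]]
    ```

    Real systems add compression and disk paging on top, but the access pattern - two array lookups, then a sequential scan - is the core of why CSR traversals are fast.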
    --------  
    1:03:35
  • #043 Knowledge Graphs Won't Fix Bad Data
    Metadata is the foundation of any enterprise knowledge graph. By organizing both technical and business metadata, organizations create a "brain" that supports advanced applications like AI-driven data assistants. The goal is to achieve economies of scale - making data reusable, traceable, and ultimately more valuable.
    Juan Sequeda is a leading expert in enterprise knowledge graphs and metadata management. He has spent years solving the challenges of integrating diverse data sources into coherent, accessible knowledge graphs. As Principal Scientist at data.world, Juan provides concrete strategies for improving data quality, streamlining feature extraction, and enhancing model explainability. If you want to build AI systems on a solid data foundation - one that cuts through the noise and delivers reliable, high-performance insights - you need to hear Juan's proven methods and real-world examples.
    Terms like ontologies, taxonomies, and knowledge graphs aren't new inventions. Ontologies and taxonomies have been studied for decades - arguably since ancient Greece. Google popularized "knowledge graphs" in 2012 by building on decades of semantic web research. Despite the current buzz, these concepts build on established work.
    Traditionally, data lives in siloed applications - each with its own relational databases, ETL processes, and dashboards. When cross-application queries and consistent definitions become painful, organizations face metadata management challenges. The first step is to integrate technical metadata (table names, columns, code lineage) into a unified knowledge graph. Then, add business metadata by mapping business glossaries and definitions to that technical layer.
    A modern data catalog should:
    • Integrate Multiple Sources: automatically ingest metadata from databases, ETL tools (e.g., dbt, Fivetran), and BI tools.
    • Bridge Technical and Business Views: link technical definitions (e.g., table "CUST_123") with business concepts (e.g., "Customer").
    • Enable Reuse and Governance: support data discovery, impact analysis, and proper governance while facilitating reuse across teams.
    Practical Approaches & Use Cases:
    • Start with a Clear Problem: whether it's reducing churn, improving operational efficiency, or meeting compliance needs, begin by solving a specific pain point.
    • Iron Thread Method: follow one query end-to-end - from identifying a business need to tracing it back to source systems - to gradually build and refine the graph.
    • Automation vs. Manual Oversight: technical metadata extraction is largely automated. For business definitions or text-based entity extraction (e.g., via LLMs), human oversight is key to ensuring accuracy and consistency.
    Technical Considerations:
    • Entity vs. Property: if you need to attach additional details or reuse an element across contexts, model it as an entity (with a unique identifier). Otherwise, keep it as a simple property.
    • Storage Options: the market offers various graph databases - Neo4j, Amazon Neptune, Cosmos DB, TigerGraph, Apache Jena (for RDF), etc. Future trends point toward multimodel systems that allow querying in SQL, Cypher, or SPARQL over the same underlying data.
    Juan Sequeda: LinkedIn | data.world | Semantic Web for the Working Ontologist | Designing and Building Enterprise Knowledge Graphs (before you buy, send Juan a message - he is happy to send you a copy) | Catalog & Cocktails (Juan's podcast)
    Nicolay Gerold: LinkedIn | X (Twitter)
    Chapters:
    00:00 Introduction to Knowledge Graphs
    00:45 The Role of Metadata in AI
    01:06 Building Knowledge Graphs: First Steps
    01:42 Interview with Juan Sequeda
    02:04 Understanding Buzzwords: Ontologies, Taxonomies, and More
    05:05 Challenges and Solutions in Data Management
    08:04 Practical Applications of Knowledge Graphs
    15:38 Governance and Data Engineering
    34:42 Setting the Stage for Data-Driven Problem Solving
    34:58 Understanding Consumer Needs and Data Challenges
    35:33 Foundations and Advanced Capabilities in Data Management
    36:01 The Role of AI and Metadata in Data Maturity
    37:56 The Iron Thread Approach to Problem Solving
    40:12 Constructing and Utilizing Knowledge Graphs
    54:38 Trends and Future Directions in Knowledge Graphs
    59:17 Practical Advice for Building Knowledge Graphs
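    The entity-vs-property rule of thumb can be illustrated with a toy graph. Everything here is hypothetical (identifiers, schema - not data.world's actual model): "Customer" gets its own identifier because other things link to it across contexts, while `signup_date` stays a plain property since nothing else needs to reference it.

    ```python
    # Toy knowledge graph: nodes keyed by ID, edges as (subject, predicate, object).
    graph = {
        "nodes": {
            "cust:123": {"type": "Customer", "signup_date": "2021-04-02"},
            "table:CUST_123": {"type": "Table", "source": "warehouse"},
            "term:customer": {"type": "GlossaryTerm", "definition": "A paying account"},
        },
        "edges": [
            ("table:CUST_123", "describes", "cust:123"),
            ("term:customer", "defines", "cust:123"),
        ],
    }

    def linked_context(graph, node_id):
        """Everything attached to an entity.

        Only possible because the entity has a unique identifier; a plain
        property like signup_date cannot be the target of an edge.
        """
        return [(s, p) for s, p, o in graph["edges"] if o == node_id]
    ```

    Here the technical layer (the table) and the business layer (the glossary term) both point at the same Customer entity, which is exactly the bridging a data catalog is supposed to provide.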
    --------  
    1:10:59

About How AI Is Built

Real engineers. Real deployments. Zero hype. We interview the top engineers who actually put AI in production. Learn what the best engineers have figured out through years of experience. Hosted by Nicolay Gerold, CEO of Aisbach and CTO at Proxdeal and Multiply Content.
Podcast website
