Talktolisten/README.md

Discover a new world of interaction with 'Talk To Listen' on your mobile phone – where your voice brings characters to life! Engage in seamless conversations with a diverse universe of characters, each boasting their own unique personality and voice.

Check out the website for download links and previews.

Features

  • Voice-Activated Conversations: Interact with characters using your voice. You don't have to type anything or touch the screen; just talk and listen.
  • Diverse Characters: Engage with a wide range of characters, each with their own unique personality and voice.
  • Real-Time Interaction: Experience real-time responses to your voice commands.
  • Group Conversations: Talk with multiple characters at once.
  • Immersive Storytelling: Dive into a world of storytelling and adventure.
  • Customizable Characters: Personalize your characters with unique traits and characteristics.
  • Cross-Platform Compatibility: Access 'Talk To Listen' on any device, anytime, anywhere.
  • Multi-Language Support: Communicate with characters in multiple languages.
  • Safe and Secure: Enjoy a safe and secure environment for conversations, generating content that is safe for all ages.

Tech Stack

Architecture

Overview

The application's architecture is distributed, with several components interacting to provide the overall functionality. The front-end is built with Expo React Native, Redux, Firebase, and Axios, while the back-end uses FastAPI, SQLAlchemy, Firebase, Docker, and other technologies. The data is stored in a PostgreSQL database, and the application uses GitHub Actions, Docker, and Azure services for continuous integration and deployment. It also integrates with third-party APIs for features like live voice streaming and text-to-speech.

1. Infrastructure

  • Tech: Azure Virtual Machine, Azure Application Gateway, Azure Load Balancer, Azure Virtual Network, Azure Network Security Group.

  • Azure Application Gateway: A web traffic load balancer that manages traffic to the servers. It provides SSL termination, which offloads the encryption and decryption of SSL traffic from the web servers, and health probes, which automatically remove unhealthy instances from the rotation. (This service is expensive, so it is likely to be removed in the future.)
  • Azure Load Balancer: Distributes incoming network traffic across multiple virtual machines to ensure high availability and fault tolerance.
  • Azure Virtual Network: Connects virtual machines to each other and to other Azure services securely. The virtual machines are only accessible through the internal load balancer.
  • Azure Network Security Group: Provides network security by filtering inbound and outbound traffic to the virtual machines.
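The combination of load balancing and health probes described above can be illustrated with a small, self-contained sketch. This is not how Azure implements it; it is a toy round-robin pool in which the `Backend` and `RoundRobinPool` names, the VM names, and the probe function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Backend:
    name: str
    healthy: bool = True  # updated by the most recent health probe


class RoundRobinPool:
    """Rotate requests across backends, skipping any that failed the last probe."""

    def __init__(self, backends: List[Backend]) -> None:
        self.backends = backends
        self._next = 0  # index of the backend to try next

    def run_probes(self, probe: Callable[[Backend], bool]) -> None:
        # A real probe would issue an HTTP or TCP check; here it is injected.
        for backend in self.backends:
            backend.healthy = probe(backend)

    def pick(self) -> Backend:
        # Try each backend at most once, starting from the rotation pointer.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend.healthy:
                return backend
        raise RuntimeError("no healthy backends in rotation")


if __name__ == "__main__":
    pool = RoundRobinPool([Backend("vm-1"), Backend("vm-2"), Backend("vm-3")])
    pool.run_probes(lambda b: b.name != "vm-2")  # pretend vm-2 is down
    print([pool.pick().name for _ in range(4)])  # ['vm-1', 'vm-3', 'vm-1', 'vm-3']
```

With at least two healthy instances in the pool, losing one only shrinks the rotation rather than taking the service down.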

2. Front-end

  • Tech: Expo React Native (JavaScript), Redux, Firebase, Axios, Expo Update.
  • GitHub

  • Expo React Native (JavaScript): A cross-platform framework for building mobile applications using JavaScript and React. It allows developers to write code once and deploy it on both iOS and Android platforms.
  • Expo Update: A service that delivers over-the-air updates for Talk To Listen. The app can be updated immediately without going through the app store, so bugs can be fixed quickly.
  • Redux: A state management library that helps manage the application's state in a predictable way.
  • Firebase: Provides secure authentication for users and stores data in real time.
  • Axios: A promise-based HTTP client that makes it easy to send asynchronous HTTP requests to the backend server.

3. Back-end

  • Tech: FastAPI (Python), SQLAlchemy, Firebase, Docker, Nginx, Gunicorn, Alembic, Pydantic, pytest, RESTful APIs, Azure Virtual Machine.
  • GitHub
  • API Documentation

  • FastAPI (Python): A modern, high-performance Python framework for building APIs.
  • RESTful APIs: The backend services expose RESTful APIs that the frontend can consume to interact with the application.
  • Azure Virtual Machines: The backend services are hosted on multiple duplicated virtual machines. More than one instance is always running, which provides high availability and fault tolerance.
  • SQLAlchemy: A Python SQL toolkit and Object-Relational Mapping (ORM) library that provides a set of high-level APIs for working with databases.
  • Firebase: Provides secure authentication across the frontend and backend services, so only authorized users can access the application.
  • Docker: The backend services are containerized using Docker to ensure consistency and portability across different environments.
  • pytest: All backend services are tested with pytest to ensure that they work as expected.
  • SSL/TLS: The backend services use SSL/TLS to encrypt data in transit and ensure secure communication between the frontend and backend.
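As a rough illustration of how a backend can let only authorized users through, here is a minimal, framework-free sketch of a bearer-token check. It is an assumption about the general pattern, not the project's actual code: `authenticate`, `Unauthorized`, and `fake_verify` are hypothetical names. In a real deployment the verifier would typically wrap the Firebase Admin SDK's `auth.verify_id_token` and translate its errors.

```python
from typing import Callable, Mapping

# Hypothetical verifier type: takes a raw ID token and returns the user ID,
# or raises ValueError if the token is invalid or expired.
TokenVerifier = Callable[[str], str]


class Unauthorized(Exception):
    """Raised when a request carries no valid credentials."""


def authenticate(headers: Mapping[str, str], verify: TokenVerifier) -> str:
    """Return the authenticated user ID for a request, or raise Unauthorized.

    Assumes header names are already normalized to 'Authorization'.
    """
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        raise Unauthorized("missing bearer token")
    try:
        return verify(token)
    except ValueError as exc:
        raise Unauthorized("invalid token") from exc


def fake_verify(token: str) -> str:
    """Stand-in verifier for the demo; a real one would call the auth provider."""
    known = {"good-token": "user-123"}
    if token not in known:
        raise ValueError("unknown token")
    return known[token]


if __name__ == "__main__":
    print(authenticate({"Authorization": "Bearer good-token"}, fake_verify))  # user-123
```

Passing the verifier in as a parameter keeps the check easy to test without network access.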

4. Database

  • Tech: PostgreSQL, Azure Database for PostgreSQL, Azure Blob Storage, Azure CDN (Content Delivery Network).

  • PostgreSQL: Talk To Listen uses PostgreSQL as the primary database to store user data, character information, and other application data.
  • Azure Database for PostgreSQL: A fully managed database service that provides high availability, scalability, and security for PostgreSQL databases.
  • Azure Blob Storage: Used to store large amounts of unstructured data, such as images, audio files, and other media files.
  • Azure CDN (Content Delivery Network): The Azure CDN is used to cache static content, such as images and media files, to improve performance and reduce latency for users.

Design

  • The database schema is designed to store user data, character information, and other application data in a structured and efficient manner.
  • Entity-Relationship Diagram: The database schema is designed using an Entity-Relationship Diagram (ERD) to visualize the relationships between different entities and attributes.
  • UML Diagram: The database schema is designed using a Unified Modeling Language (UML) diagram to visualize the classes, attributes, and relationships between different entities.
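To make the idea of related entities concrete, here is a small sketch of what such a schema might look like. The table and column names are hypothetical and not taken from the project, and the sketch uses Python's built-in `sqlite3` in place of PostgreSQL and SQLAlchemy so that it runs without any setup.

```python
import sqlite3

# Hypothetical tables: users, characters, and the messages that link them.
SCHEMA = """
CREATE TABLE users (
    id           TEXT PRIMARY KEY,
    display_name TEXT NOT NULL
);
CREATE TABLE characters (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    personality TEXT NOT NULL,
    voice       TEXT NOT NULL
);
CREATE TABLE messages (
    id           INTEGER PRIMARY KEY,
    user_id      TEXT    NOT NULL REFERENCES users(id),
    character_id INTEGER NOT NULL REFERENCES characters(id),
    body         TEXT    NOT NULL
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = open_db()
    conn.execute("INSERT INTO users VALUES (?, ?)", ("u1", "Sam"))
    conn.execute(
        "INSERT INTO characters (name, personality, voice) VALUES (?, ?, ?)",
        ("Ava", "cheerful", "voice-a"),
    )
    conn.execute(
        "INSERT INTO messages (user_id, character_id, body) VALUES (?, ?, ?)",
        ("u1", 1, "Hello"),
    )
    rows = conn.execute(
        "SELECT u.display_name, c.name, m.body FROM messages m "
        "JOIN users u ON u.id = m.user_id "
        "JOIN characters c ON c.id = m.character_id"
    ).fetchall()
    print(rows)  # [('Sam', 'Ava', 'Hello')]
```

The foreign keys are what an ERD would draw as lines between entities: each message belongs to exactly one user and one character.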

5. Continuous Integration/Continuous Deployment

  • Tech: Git/GitHub, GitHub Actions, Docker, Azure Virtual Machine

  • Git/GitHub: The source code is stored in GitHub repositories for version control and collaboration.
  • GitHub Actions: Used for continuous integration and continuous deployment (CI/CD) to automate the build, test, and deployment processes for the backend services.
  • Docker: The backend services are containerized using Docker to ensure consistency and portability across different environments.
  • Azure Virtual Machine: The backend services are deployed on Azure Virtual Machines using Docker containers.

6. Security

User data and privacy are of utmost importance. The application uses several security measures to keep user data protected.

  • SSL/TLS: The backend services use SSL/TLS to encrypt data in transit and ensure secure communication between the frontend and backend.
  • Firebase Authentication: Provides secure authentication for users and ensures that only authorized users can access the application.
  • Azure Network Security Group: Filters inbound and outbound traffic to the virtual machines to provide network security.
  • Delete User Data: Users can delete their account at any time, and all data is deleted from the database and storage.
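Account deletion that covers both the database and storage might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's implementation: the schema, the `users/<id>/` key prefix, and the `delete_account` helper are hypothetical, `sqlite3` stands in for PostgreSQL, and a plain dictionary stands in for Azure Blob Storage.

```python
import sqlite3
from typing import Dict

# Hypothetical schema: deleting a user cascades to that user's messages.
DEMO_SCHEMA = """
CREATE TABLE users (id TEXT PRIMARY KEY);
CREATE TABLE messages (
    id      INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    body    TEXT NOT NULL
);
"""


def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for cascades in SQLite
    conn.executescript(DEMO_SCHEMA)
    return conn


def delete_account(conn: sqlite3.Connection, blobs: Dict[str, bytes], user_id: str) -> int:
    """Delete a user's rows and stored media; return the number of blobs removed."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    prefix = f"users/{user_id}/"  # hypothetical per-user key layout
    doomed = [key for key in blobs if key.startswith(prefix)]
    for key in doomed:
        del blobs[key]
    return len(doomed)


if __name__ == "__main__":
    conn = make_demo_db()
    conn.executemany("INSERT INTO users VALUES (?)", [("u1",), ("u2",)])
    conn.executemany(
        "INSERT INTO messages (user_id, body) VALUES (?, ?)",
        [("u1", "hi"), ("u1", "bye"), ("u2", "hello")],
    )
    blobs = {"users/u1/avatar.png": b"...", "users/u2/avatar.png": b"..."}
    print(delete_account(conn, blobs, "u1"))  # 1
    print(conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0])  # 1
```

Letting the database cascade the delete keeps the application code from having to know every table that references a user.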

7. Third-party APIs

Upcoming Features

  • Voice Cloning Support: I'm testing an open-source voice cloning model that will let users clone their voice and use it in the app. This will make conversations more personal and engaging.
  • Lock Screen Support: I'm working on adding lock screen support, which will allow users to interact with the app even when their screen is locked. This feature will enhance the app's usability on the go, save battery life, and provide convenience for users who use earphones.

Developer

  • Hieu "Leo" Nguyen.
  • Website and GitHub
  • The code is for showcase purposes only and is released under the Apache 2.0 License.

Popular repositories

  1. Talktolisten (Public)

    Talk To Listen - AI-powered Voice Chat and Texting Platform. Talk To Life-Like Characters

  2. talktolisten-backend (Public)

    Talk To Listen Backend

    Python

  3. talktolisten-frontend (Public)

    Talk To Listen Frontend

    JavaScript

  4. talktolisten-LLM-dev (Public)

    Talk To Listen LLM Creation

    Jupyter Notebook

  5. talktolisten-LLM-prod (Public)

    LLM Deployment

    Python