This project is a Multi-Modal Search Engine built on OpenAI's CLIP, with a Flask API for the backend and HTML/CSS for the frontend web application.
It provides a clean web interface where users enter a text query and the system retrieves the images most relevant to that description, using the CLIP architecture (see the [paper](https://arxiv.org/abs/2103.00020)).
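At its core, the retrieval step embeds the query text and the candidate images into the same vector space and ranks images by similarity. Below is a minimal sketch of that idea using the Hugging Face port of CLIP; the model name and image file names are illustrative, not this project's actual configuration.

```python
# Minimal sketch of CLIP-based text-to-image matching (illustrative, not the
# repo's exact code). Model name and image paths are assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

images = [Image.open(p) for p in ["cat.jpg", "beach.jpg"]]  # hypothetical files
inputs = processor(text=["a photo of a cat"], images=images,
                   return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_text[i, j] scores how well image j matches text query i;
# the highest-scoring image is the best match for the query.
scores = outputs.logits_per_text.softmax(dim=-1)
print(scores.argmax(dim=-1))
```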
- This video demonstrates how to use our project's main feature.
- Sample data of 130 images is already included, or
- see the video, or
- prepare your own data as follows (a sketch of the embedding step appears after this list):
  - Place your images in `src/minidata`
  - Run the notebook in `src/image-processor`
  - Move the data in `src/image_embeddings` and the data in `src/minidata` to `flaskapp/image_embeddings` and `flaskapp/static` respectively (caution: transfer the data, not the directories)
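The sketch below shows what the embedding step conceptually does, assuming the notebook encodes every image in `src/minidata` with CLIP and saves the vectors for the Flask app to search; the model name and output file name are assumptions, not the repo's exact code.

```python
# Assumed shape of the image-embedding step: encode each image in src/minidata
# and persist the vectors. Output file name is hypothetical.
import os
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

names, vectors = [], []
for fname in sorted(os.listdir("src/minidata")):
    if not fname.lower().endswith((".jpg", ".jpeg", ".png")):
        continue  # skip non-image files
    image = Image.open(os.path.join("src/minidata", fname)).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        feat = model.get_image_features(**inputs)
    # L2-normalise so a dot product later equals cosine similarity
    vectors.append(feat / feat.norm(dim=-1, keepdim=True))
    names.append(fname)

torch.save({"names": names, "vectors": torch.cat(vectors)},
           "src/image_embeddings/embeddings.pt")
```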
- Multi-Modal Search: Users can input textual descriptions of images to retrieve relevant images.
- Intuitive Web Interface: The frontend is built with HTML/CSS to provide a user-friendly experience.
- Scalable Backend: Flask API serves as the backend, handling requests and interacting with the CLIP model.
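To make the backend's role concrete, here is a hypothetical sketch of a search endpoint; the route name, query parameter, response shape, and embeddings file are assumptions, not the repo's actual API.

```python
# Hypothetical Flask search endpoint: embed the text query, rank precomputed
# image embeddings by cosine similarity, return the top file names.
import torch
from flask import Flask, jsonify, request
from transformers import CLIPModel, CLIPProcessor

app = Flask(__name__)
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
store = torch.load("image_embeddings/embeddings.pt")  # assumed file name

@app.route("/search")
def search():
    query = request.args.get("q", "")
    inputs = processor(text=[query], return_tensors="pt", padding=True)
    with torch.no_grad():
        text_feat = model.get_text_features(**inputs)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    # Dot product of normalised vectors = cosine similarity per image
    scores = (text_feat @ store["vectors"].T).squeeze(0)
    top = scores.topk(min(5, len(store["names"]))).indices.tolist()
    return jsonify([store["names"][i] for i in top])
```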
Clone the repository:

```bash
git clone https://github.com/ahmedembeddedxx/multimodal-search-engine.git
```

Start the backend server:

```bash
cd flaskapp/
flask run
```

Access the web application in your browser at http://127.0.0.1:5000/.
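Once the server is up, you can sanity-check it from a second terminal. The snippet below is a minimal smoke test and assumes the hypothetical `/search` route sketched earlier; substitute the app's real endpoint.

```python
# Smoke test against the local dev server (stdlib only).
# The /search route and "q" parameter are assumptions from the sketch above.
import urllib.parse
import urllib.request

query = urllib.parse.urlencode({"q": "a dog playing on the beach"})
with urllib.request.urlopen(f"http://127.0.0.1:5000/search?{query}") as resp:
    print(resp.read().decode())
```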
- Port the frontend to ReactJS
- Use ImageBind by Meta AI
- More accurate model evaluation
- Integrate audio & video functionality