Add vector_search function for pipeline aggregation #30
Feedback from Senior Dev Bot
    def update_one(self, collection, query, update):
        return self.db[collection].update_one(query, update)

    # add a function for pipeline aggregation vector search
    def vector_search(self, collection, embedding):

        result = self.db[collection].aggregate([
            {
                "$vectorSearch": {
                    "index": "vector_index",
                    "path": "face_embedding",
                    "queryVector": embedding,
                    "numCandidates": 20,
                    "limit": 20
                }
            }, {
                '$project': {
                    '_id': 0,
                    'Name': 1,
                    'Image': 1,
                    'score': {
                        '$meta': 'vectorSearchScore'
                    }
                }
            }
        ])
        result_arr = [i for i in result]
        return result_arr
Consider extracting the query and projection into variables for better readability and maintainability. This practice enhances code clarity and simplifies future modifications.
def vector_search(self, collection, embedding):
    query = {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "face_embedding",
            "queryVector": embedding,
            "numCandidates": 20,
            "limit": 20
        }
    }
    projection = {
        '$project': {
            '_id': 0,
            'Name': 1,
            'Image': 1,
            'score': {'$meta': 'vectorSearchScore'}
        }
    }
    result = self.db[collection].aggregate([query, projection])
    return [i for i in result]
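Going one step further than extracting variables, the two stages could be built by a small pure function so the index name, path, and limits are no longer hard-coded. The following is only a sketch: `build_vector_search_pipeline` and its parameter names are hypothetical and not part of this PR; the defaults mirror the values in the diff. It also checks that `numCandidates` is not smaller than `limit`, which Atlas rejects.

```python
from typing import Any, Dict, List, Sequence


def build_vector_search_pipeline(
    embedding: Sequence[float],
    index: str = "vector_index",      # default taken from the PR diff
    path: str = "face_embedding",     # default taken from the PR diff
    limit: int = 20,
    num_candidates: int = 20,
) -> List[Dict[str, Any]]:
    """Return the $vectorSearch and $project stages as a plain list.

    Hypothetical helper: it only builds dictionaries and never talks
    to the database, so it can be unit-tested without MongoDB.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if num_candidates < limit:
        raise ValueError("num_candidates must be >= limit")

    search_stage = {
        "$vectorSearch": {
            "index": index,
            "path": path,
            "queryVector": list(embedding),
            "numCandidates": num_candidates,
            "limit": limit,
        }
    }
    project_stage = {
        "$project": {
            "_id": 0,
            "Name": 1,
            "Image": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    }
    return [search_stage, project_stage]
```

With such a helper, `vector_search` would reduce to `return list(self.db[collection].aggregate(build_vector_search_pipeline(embedding)))`.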
Requested changes have been made 🎉
CHANGELOG.md
Outdated
- Resolved various bugs and issues identified during the testing process.

### Removed
- Removed deprecated code and unused dependencies from the project. No newline at end of file
- Removed deprecated code and unused dependencies from the project.

## [0.1.4] - 2024-03-16 - 23:00

### Added
- Implemented a new `recognize_Face` endpoint in [`route.py`](API/route.py). This endpoint accepts a base64 string as input, converts it into embeddings, and performs a vector search query on the MongoDB Atlas database. Changes made by @Devasy23.
- Added a new `vector_search` function in [`database.py`](API/database.py). This function performs a vector similarity search on the MongoDB Atlas database using Euclidean distance as the similarity measure. Changes made by @Devasy23.
- Updated [`index.ipynb`](index.ipynb) to include examples and usage of the new `recognize_Face` endpoint and `vector_search` function. Changes made by @Devasy23.

### Changed
- Updated the `Database` class in [`database.py`](API/database.py) to include the new `vector_search` function. Changes made by @Devasy23.

### Fixed
- Resolved various bugs and issues identified during the implementation and testing of the new features. Fixes made by @Devasy23.
Great work on extending the project's functionality with the new recognize_Face endpoint and vector_search function. Here's a little feedback:
- Code Reusability and Clarity: For the vector_search function, consider defining a separate utility for Euclidean distance calculation if not done already. This promotes code reuse, especially if other parts of the system perform similar computations.
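# Suggested improvement for database.py
import numpy as np

def euclidean_distance(vector1, vector2):
    # np.asarray lets plain Python lists be passed as well as arrays.
    return np.linalg.norm(np.asarray(vector1) - np.asarray(vector2))

def vector_search(query_vector):
    # Use euclidean_distance in the search algorithm.
    ...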
- Documentation and Examples: Ensure the updated index.ipynb has clear, concise examples that not only illustrate how to use the new features but also highlight any potential edge cases or limitations.
- Consistent Formatting: The addition of a newline at the end of files is a good practice; make sure this is consistently applied across all edited files.
Remember to consider unit tests for the new functionalities if not already included, ensuring robustness and future maintainability.
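For reference, a dependency-free version of the distance helper could look like the sketch below. It assumes Python 3.8+ for `math.dist`; the function name matches the suggestion above but is otherwise hypothetical. Note that `$vectorSearch` computes its score on the Atlas side according to the similarity configured in the search index, so a helper like this would mainly serve client-side checks or tests rather than the search itself.

```python
import math
from typing import Sequence


def euclidean_distance(vector1: Sequence[float], vector2: Sequence[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(vector1) != len(vector2):
        raise ValueError("vectors must have the same length")
    # math.dist is available from Python 3.8 onwards.
    return math.dist(vector1, vector2)
```

A unit test for it can be as small as checking a 3-4-5 triangle: `euclidean_distance([0, 0], [3, 4])` should give `5.0`.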
Requested changes have been made 🎉
API/route.py
Outdated
    client.find_one_and_delete(collection, {"EmployeeCode": EmployeeCode})

    return {"Message": "Successfully Deleted"}


@router.post("/recognize_face", response_class=Response)
async def recognize_face(Face: UploadFile = File(...)):
    """
    Recognize a face from the provided image.

    Args:
        Face (UploadFile): The image file to be recognized.

    Returns:
        Response: A response object containing the recognized employee information in JSON format.

    Raises:
        HTTPException: If an internal server error occurs.
    """
    logging.info("Recognizing Face")
    try:
        img_data = await Face.read()
        with open("temp.png", "wb") as f:
            f.write(img_data)

        embedding = DeepFace.represent(img_path="temp.png", model_name="Facenet")
        result = client2.vector_search(collection2, embedding[0]['embedding'])
        logging.info(f"Result: {result}")
        os.remove("temp.png")
    except Exception as e:
        logging.error(f"Error: {e}")
        os.remove("temp.png")
        raise HTTPException(status_code=500, detail="Internal server error")
    return Response(
        content=bytes(json.dumps(result[0], default=str), "utf-8"),
        media_type="application/json",
    )
- Temporary File Creation: Directly writing the uploaded image to a file (temp.png) can lead to concurrency issues and security concerns. Use a temporary file with a context manager to ensure it gets cleaned up properly, even in case of errors.
from tempfile import NamedTemporaryFile

async def recognize_face(Face: UploadFile = File(...)):
    logging.info("Recognizing Face")
    try:
        img_data = await Face.read()
        with NamedTemporaryFile(delete=True, suffix=".png") as temp_file:
            temp_file.write(img_data)
            temp_file.flush()
            embedding = DeepFace.represent(img_path=temp_file.name, model_name="Facenet")
            result = client2.vector_search(collection2, embedding[0]['embedding'])
    except Exception as e:
        logging.error(f"Error: {e}")
        raise HTTPException(status_code=500, detail="Internal server error")
- Error Handling: The current error handling may catch too broad a range of exceptions, potentially swallowing unexpected errors and making debugging difficult. Be specific about which errors you catch, or make sure to re-raise unexpected ones.
- File Reading Directly in Endpoint: It's a good practice to separate out logic into service layers or utility functions. This helps keep your endpoint functions clean and more maintainable.
- Use Environment Variables for file paths or model names to make the application more flexible and secure.
- DRY Principle: Consider whether the pattern of removing a file is repeated elsewhere in your code. If so, abstract the cleanup logic into a utility function.
Overall, ensure every aspect adheres to scalability, security, and maintainability principles.
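As one possible shape for the cleanup utility mentioned under the DRY point, the sketch below writes the uploaded bytes to a uniquely named temporary file, hands the path to a callback, and always deletes the file afterwards. `with_temp_file` is a hypothetical name, not something in this repository. It uses `tempfile.mkstemp` rather than `NamedTemporaryFile` so the file is closed before the callback opens it by path, which matters on platforms that do not allow a second open of a still-open temporary file.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def with_temp_file(data: bytes, process: Callable[[str], T], suffix: str = ".png") -> T:
    """Write data to a unique temp file, run process(path), then delete the file.

    Hypothetical helper: the unique name avoids clashes between
    concurrent requests, and the finally block removes the file
    whether process() returns or raises.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Close the descriptor before handing the path to the callback.
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return process(path)
    finally:
        os.remove(path)
```

In the endpoint this could be used as `with_temp_file(img_data, lambda p: DeepFace.represent(img_path=p, model_name="Facenet"))`, which would replace both `os.remove("temp.png")` calls in the diff.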
Requested changes have been made 🎉
99f5c9d to 3feddf0
fa33e0b to 38e4d08
a28eb2a to a86697a
Quality Gate passed
Quality Gate failed
This pull request adds a new function called vector_search to the Database class in database.py. The vector_search function performs a pipeline aggregation vector search on the MongoDB Atlas database using the provided embedding. It returns a list of results with the name, image, and score of the closest matches. This functionality is useful for performing similarity searches based on face embeddings.