Open
32 commits
d381b0f
update use case
nitpicker55555 Apr 29, 2025
ef75b1d
update use case
nitpicker55555 Apr 29, 2025
04ebf68
Update README.md
nitpicker55555 Apr 29, 2025
54d5cca
Update README.md
nitpicker55555 Apr 29, 2025
f1a6201
update use case
nitpicker55555 Apr 30, 2025
f4316cc
update use case
nitpicker55555 Apr 30, 2025
3c63a7c
fixed bugs
nitpicker55555 Apr 30, 2025
eed4cf1
update workflow
nitpicker55555 Apr 30, 2025
594bfc5
Update README.md
nitpicker55555 Apr 30, 2025
df1ee1c
update use case
nitpicker55555 Apr 30, 2025
7a477e7
update template
nitpicker55555 May 1, 2025
3ef04c1
update example
nitpicker55555 May 1, 2025
b9ede45
update workflow
nitpicker55555 May 1, 2025
1da27bd
Update README.md
nitpicker55555 May 1, 2025
2aa418d
Update run_profile_generation.py
nitpicker55555 May 1, 2025
c5ca5b3
update example
nitpicker55555 May 2, 2025
f386d9a
update template
nitpicker55555 May 2, 2025
332644d
add publications
nitpicker55555 May 4, 2025
7dec7a7
update template
nitpicker55555 May 4, 2025
4545355
update
nitpicker55555 May 10, 2025
1b342a3
Merge branch 'camel-ai:main' into profile-generation
nitpicker55555 Jun 17, 2025
5cc01cb
update
nitpicker55555 Jun 17, 2025
2722da8
update
nitpicker55555 Jun 17, 2025
56cc2b2
update
nitpicker55555 Jun 17, 2025
c20eebf
update
nitpicker55555 Jun 25, 2025
091edb8
update
nitpicker55555 Jun 25, 2025
67cdf40
update
nitpicker55555 Jun 25, 2025
3b3a121
update
nitpicker55555 Jun 25, 2025
d8ec2d1
update
nitpicker55555 Jun 26, 2025
caf9e65
update
nitpicker55555 Jun 26, 2025
8879785
update
nitpicker55555 Jun 30, 2025
82dd9c3
update
nitpicker55555 Jul 9, 2025
3 changes: 3 additions & 0 deletions .gitignore
@@ -61,3 +61,6 @@ coverage.xml
owl/camel/types/__pycache__/
owl/camel/__pycache__/
owl/camel/utils/__pycache_/
/community_usecase/Profile_generation/Ahmed Eltawil _ Computer, Electrical and Mathematical Sciences and Engineering_files/
/community_usecase/Profile_generation/my-user-data/
/community_usecase/Profile_generation/my-user-data-dir/
70 changes: 70 additions & 0 deletions community_usecase/Profile_generation/README.md
@@ -0,0 +1,70 @@
# Scholar Profile Generation

This project provides an end-to-end pipeline for automatically generating a polished HTML profile for an academic **scholar** starting from their Google Scholar (or similar) profile link.

The pipeline is powered by [CAMEL-AI](https://github.com/camel-ai/camel) agents and Playwright-based headless browsing. It is exposed both as a **CLI tool** and as a **Flask REST API** with an accompanying (very) minimal front-end.

---

## Key Features

* **Smart link discovery** – Uses the Exa search engine, via `camel-ai`'s `SearchToolkit`, to find high-quality links related to the scholar (homepage, Google Scholar, institutional pages, social media, etc.).
* **Headless crawling & markdown extraction** – `BrowserNonVisualToolkit` (Playwright under the hood) visits each link, extracts the main content, and stores it as individual markdown files.
* **Asynchronous Web Agent** – Link crawling and content extraction are orchestrated with **asyncio**, so multiple pages are processed **concurrently**, which shortens the end-to-end runtime compared with sequential scraping.
* **Automatic HTML generation** – Aggregated markdown is parsed with an LLM to fill an HTML template (`template.html`), resulting in a share-ready profile page.
* **Progress tracking** – Long-running jobs execute in a background thread. Progress can be polled via `/api/progress/<job_id>`.
* **Dual usage modes** – CLI or web server:
  * **CLI**: `python app.py --url <scholar_url>`
  * **Web**: `python app.py` (serves on `http://localhost:5000`)
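
The concurrent crawling described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: `fetch_markdown` is a hypothetical stand-in that sleeps instead of driving a browser, and `crawl_all` and `max_concurrency` are names invented for this example.

```python
import asyncio


async def fetch_markdown(url: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for a browser visit; sleeps instead of loading a page."""
    await asyncio.sleep(delay)
    return f"# Content from {url}\n"


async def crawl_all(urls: list[str], max_concurrency: int = 3) -> dict[str, str]:
    """Fetch all URLs concurrently, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url: str) -> tuple[str, str]:
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return url, await fetch_markdown(url)

    # gather() schedules every fetch and returns results in input order.
    pairs = await asyncio.gather(*(bounded_fetch(u) for u in urls))
    return dict(pairs)


if __name__ == "__main__":
    links = ["https://example.org/home", "https://example.org/publications"]
    pages = asyncio.run(crawl_all(links))
    for url, markdown in pages.items():
        print(url, len(markdown))
```

Because the fetches overlap, the total wait is governed by the slowest pages rather than the sum of all of them. The semaphore is one way to avoid opening too many pages at once.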

---

## Project Structure

```
community_usecase/Profile_generation/
├── app.py          # Main entry point (CLI & Flask server)
├── template.html   # Bootstrap-based HTML template filled by the pipeline
├── templates/      # Front-end assets (index.html, CSS, JS)
└── workflow.png    # Architecture diagram
```

---

## Quick Start

1. **Install Python dependencies**

   ```bash
   python -m venv .venv
   source .venv/bin/activate # On Windows use: .venv\Scripts\activate
   pip install --upgrade pip
   pip install -r requirements.txt
   ```

2. **Set environment variables**

   The pipeline needs API keys for the LLM and for Exa search.

   ```bash
   export OPENAI_API_KEY=<your-openai-key>
   export EXA_API_KEY=<your-exa-key>
   ```

3. **Run the web server**

   ```bash
   python app.py
   # Open http://localhost:5000 in your browser
   ```

   Or generate a profile once via CLI:

   ```bash
   python app.py --url "https://scholar.google.com/citations?user=<SCHOLAR_ID>"
   ```

   The resulting HTML file will be placed in the project directory (e.g. `profile.html`).
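
When the web server is used, long-running jobs execute in a background thread and their progress is polled via `/api/progress/<job_id>`. The sketch below shows one way such a job registry could look, using only the standard library. It is an assumption-laden illustration rather than the project's implementation: the in-memory `_jobs` dictionary, the `start_job` and `get_progress` helpers, and the `status`/`step`/`percent` fields are all hypothetical.

```python
import threading
import time
import uuid

# Hypothetical in-memory registry; the real app may store progress differently.
_jobs: dict[str, dict] = {}
_jobs_lock = threading.Lock()


def _run_job(job_id: str, steps: list[str]) -> None:
    """Work through the steps, recording progress after each one."""
    for index, step in enumerate(steps, start=1):
        time.sleep(0.01)  # placeholder for real work (search, crawl, render)
        with _jobs_lock:
            _jobs[job_id].update(step=step, percent=int(100 * index / len(steps)))
    with _jobs_lock:
        _jobs[job_id]["status"] = "done"


def start_job(steps: list[str]) -> str:
    """Register a job, start it in a background thread, and return its id."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"status": "running", "step": None, "percent": 0}
    threading.Thread(target=_run_job, args=(job_id, steps), daemon=True).start()
    return job_id


def get_progress(job_id: str) -> dict:
    """Return a snapshot of the job state, as a progress endpoint might."""
    with _jobs_lock:
        return dict(_jobs[job_id])


if __name__ == "__main__":
    job = start_job(["search", "crawl", "render"])
    while get_progress(job)["status"] != "done":
        time.sleep(0.01)
    print(get_progress(job))
```

A progress route in a web framework could return the `get_progress` snapshot as JSON. The lock keeps the worker thread and the request handler from reading a half-updated record.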

---
