A while ago I wrote a scraper for my school's academic affairs system. I don't plan to build a front end for it, so I've put it on GitHub for anyone who needs it. The code is rough, so please bear with me; I only started learning over the winter vacation 😂.
Project Overview#
The Student Information System is a microservices-based application for managing student information. It scrapes data from the school's academic affairs system and exposes it through a unified API that front ends and other systems can call.
The system uses a distributed microservices architecture: each service owns one functional domain and is reached from outside through an API gateway. Authentication and authorization use JWT to protect user data and privacy.
System Architecture#
The system consists of the following six main components (five services and one shared library):
1. Authentication Service (AuthService)#
Responsible for user authentication and authorization management, it is the core component of system security.
Main Functions:
- User login verification
- JWT token generation and management
- Token validation and refresh
- Browser automation login implementation
Core Components:
- LoginService: Handles user login requests and generates JWT tokens
- BrowserManager: Manages Playwright browser instances to implement automated login
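When debugging the tokens that the Authentication Service issues, it can help to look at the claims inside one. The sketch below decodes the payload segment of a JWT in plain shell. It assumes GNU coreutils (base64 -d), the function name jwt_payload is made up for this example, and it does not verify the signature, so treat the output as a debugging aid only.

```shell
#!/bin/sh
# Print the decoded payload (claims) of a JWT WITHOUT verifying its signature.
# jwt_payload is a hypothetical helper name, not part of the project.
jwt_payload() {
    # The payload is the second of three dot-separated segments.
    # base64url uses '-' and '_' where standard base64 uses '+' and '/'.
    segment=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
    # base64url drops the '=' padding, so restore it before decoding.
    case $(( ${#segment} % 4 )) in
        2) segment="${segment}==" ;;
        3) segment="${segment}=" ;;
    esac
    printf '%s' "$segment" | base64 -d
}
```

Calling jwt_payload with a token prints the JSON claims, which is a quick way to check the expiry the service actually put in the token.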
2. Common Library (Common)#
Contains components shared by all services to ensure code reuse and consistency.
Main Content:
- Data Models (Models): Defines the data structures used in the system
- Middleware: Such as API security middleware
- Filters: Such as year limit filters
- Common Services: Such as login service, parsing service, etc.
- Utility Classes: Provides common functionalities
Core Components:
- AuthModels: Defines data models related to authentication
- StudentInfoParser: Parses student information HTML
- GradeParser: Parses grade information HTML
- ScheduleParserService: Parses schedule information HTML
3. API Gateway (Gateway)#
As the unified entry point of the system, it is responsible for request routing and load balancing.
Main Functions:
- Request routing: Forwards requests to the corresponding microservices
- API aggregation: Combines results from multiple services
- Cross-Origin Resource Sharing (CORS): Handles front-end cross-origin requests
- Request rate limiting: Prevents service overload
Technical Implementation:
- Uses YARP as a reverse proxy
- Integrates JWT authentication
- Configurable routing rules
4. Grade Service (GradeService)#
Responsible for retrieving and managing student grade information.
Main Functions:
- Query student historical grades
- Grade statistics and analysis
- Grade data scraping and parsing
Core Components:
- GradeService: Handles grade query requests
- GradeParser: Parses grade HTML data
5. Schedule Service (ScheduleService)#
Responsible for retrieving and managing student schedule information.
Main Functions:
- Query current semester schedule
- Query historical schedules by academic year and semester
- Schedule data scraping and parsing
- Schedule formatted output
Core Components:
- CourseScheduleService: Handles schedule query requests
- ScheduleParserService: Parses schedule HTML data
6. Student Information Service (StudentService)#
Responsible for retrieving and managing basic student information.
Main Functions:
- Student status information query
- Personal information query
- Contact information query
- Student information scraping and parsing
Core Components:
- StudentInfoService: Handles student information query requests
- StudentInfoCrawlerService: Scrapes student information
- StudentInfoParser: Parses student information HTML data
Technology Stack#
- Backend Framework: ASP.NET Core 8.0
- Authentication and Authorization: JWT (JSON Web Token)
- API Documentation: Swagger
- Browser Automation: Microsoft Playwright
- HTML Parsing: HtmlAgilityPack
- API Gateway: YARP
- Logging: Built-in .NET logging system
- Dependency Injection: Built-in .NET DI container
- HTTP Client: HttpClient
System Workflow#
- User Authentication Process:
  - The user sends a login request to AuthService through the Gateway
  - AuthService uses Playwright to simulate browser login to the academic affairs system
  - Upon successful verification, a JWT token is generated and returned to the user
  - The user carries the JWT token in subsequent requests to access other services
- Data Retrieval Process:
  - The user requests specific services (grades/schedule/student information) with the JWT token
  - The service verifies the token's validity
  - The service uses Playwright to scrape data from the academic affairs system
  - Uses dedicated parsers to parse HTML data
  - Returns the parsed data to the user
- Inter-Service Communication:
  - Services communicate via HTTP APIs
  - Shared data models ensure data consistency
  - Dependency injection is used to manage service dependencies
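From a client's point of view, the authentication and data retrieval flows above come down to two HTTP calls. The sketch below shows them with curl. The routes /api/auth/login and /api/grades, the request fields username and password, and the token field in the response are all assumptions made for illustration; check each service's Swagger page for the real paths and shapes.

```shell
#!/bin/sh
# Gateway address; port 5000 is the one listed for the API Gateway.
GATEWAY_URL="${GATEWAY_URL:-http://localhost:5000}"

# Log in through the gateway and print the JWT from the JSON response.
# ASSUMPTIONS: the login route, the request field names, and a string
# field called "token" in the response are hypothetical.
# Note: credentials containing double quotes would need JSON escaping.
login() {
    curl -fsS -X POST "$GATEWAY_URL/api/auth/login" \
        -H 'Content-Type: application/json' \
        -d "{\"username\":\"$1\",\"password\":\"$2\"}" |
        sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Call a protected route, sending the token as a Bearer credential.
# $1 is the token, $2 is the route (for example the hypothetical /api/grades).
authorized_get() {
    curl -fsS "$GATEWAY_URL$2" -H "Authorization: Bearer $1"
}
```

A session would then look like token=$(login "$STUDENT_ID" "$PASSWORD") followed by authorized_get "$token" /api/grades. Because the login step drives a real browser through Playwright, expect it to take noticeably longer than the follow-up requests.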
Environment Requirements#
- .NET SDK: 8.0
- Operating System: Windows/macOS/Linux (any system that supports .NET 8.0)
- Memory: At least 2GB RAM (recommended 4GB or more)
- Storage: At least 1GB of available space
- Network: Stable internet connection
Installation and Configuration#
Prerequisites#
- Install .NET 8.0 SDK
# Check if installed
dotnet --version
- Install Playwright
dotnet tool install --global Microsoft.Playwright.CLI
playwright install
- Clone the repository
git clone https://github.com/ExXTong/StudentInfoSystem
cd StudentInfoSystem
Configure Services#
Each service has its own appsettings.json configuration file, which needs to be configured according to the actual environment:
JWT Configuration
{
  "Jwt": {
    "Secret": "your_secret_key",
    "Expiration": "12:00:00"
  }
}
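The placeholder your_secret_key should be replaced with a long random value. If the services sign tokens with HMAC-SHA256 (an assumption; the configuration above does not say which algorithm is used), the key should be at least 256 bits. A minimal sketch for generating one on Linux or macOS, with a made-up function name:

```shell
#!/bin/sh
# Print a random secret: 48 bytes from /dev/urandom, base64-encoded,
# which gives a 64-character string (384 bits of key material).
# generate_jwt_secret is a hypothetical helper name.
generate_jwt_secret() {
    head -c 48 /dev/urandom | base64 | tr -d '\n'
}
```

Paste the output into the Secret field. Presumably every service that validates tokens needs the same value as the Authentication Service. ASP.NET Core's default configuration also reads environment variables, where Jwt__Secret (double underscore) maps to Jwt:Secret, so export Jwt__Secret="$(generate_jwt_secret)" is one way to keep the secret out of appsettings.json, provided the services use the default configuration sources.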
Build and Run#
Build the Project#
dotnet restore
dotnet build
Run Services#
Start the services in dependency order. Each dotnet run command keeps running in the foreground, so open a separate terminal for each service and run its commands from the repository root:
- Start the Authentication Service
cd StudentInfoSystem.AuthService
dotnet run
- Start the Student Information Service
cd StudentInfoSystem.StudentService
dotnet run
- Start the Grade Service
cd StudentInfoSystem.GradeService
dotnet run
- Start the Schedule Service
cd StudentInfoSystem.ScheduleService
dotnet run
- Start the API Gateway
cd StudentInfoSystem.Gateway
dotnet run
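If opening five terminals is inconvenient, the same start order can be scripted from the repository root. This is a rough sketch rather than part of the project: the function name start_all, the log file names, and the fixed pause between services are all choices made for this example, and the pause is not a real readiness check.

```shell
#!/bin/sh
# Service directories in the start order listed above.
SERVICES="AuthService StudentService GradeService ScheduleService Gateway"

# Start every service in the background, one log file per service.
# start_all is a hypothetical helper; run it from the repository root.
start_all() {
    for name in $SERVICES; do
        # --project points dotnet at the service folder, so no cd is needed.
        dotnet run --project "StudentInfoSystem.$name" > "$name.log" 2>&1 &
        echo "started $name (pid $!), logging to $name.log"
        # Crude pause so earlier services get a head start.
        sleep "${START_DELAY:-2}"
    done
}
```

After calling start_all, follow a service with tail -f AuthService.log, and stop everything by killing the printed process IDs.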
API Documentation#
Once the services are started, you can access the Swagger documentation for each service at the following URLs:
- Authentication Service: http://localhost:5001/swagger
- Student Information Service: http://localhost:5002/swagger
- Grade Service: http://localhost:5003/swagger
- Schedule Service: http://localhost:5004/swagger
- API Gateway: http://localhost:5000/swagger
When deploying, remember to use HTTPS and close any ports that don't need to be exposed.
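To confirm that all five services came up, the Swagger URLs above can be probed in one go. The sketch below is an illustration only: check_swagger is a made-up name, and the SCHEME variable is there so the same check can be pointed at https once TLS is set up.

```shell
#!/bin/sh
# Ports as listed above; set SCHEME=https once TLS is configured.
SCHEME="${SCHEME:-http}"
PORTS="5000 5001 5002 5003 5004"

# Probe each Swagger page and report which ones respond.
# Returns non-zero if any service did not answer successfully.
check_swagger() {
    failed=0
    for port in $PORTS; do
        url="$SCHEME://localhost:$port/swagger"
        # -f fails on HTTP errors; -L follows a redirect if the page issues one.
        if curl -fsSL -o /dev/null --max-time 5 "$url" 2>/dev/null; then
            echo "ok   $url"
        else
            echo "FAIL $url"
            failed=1
        fi
    done
    return "$failed"
}
```

Running check_swagger right after startup may report failures for a few seconds while the services are still initializing.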