MRRA
Getting Started

Quick Start

Build agents and execute three prediction tasks

This section provides a minimal runnable example. The workflow is: trajectory normalization → activity extraction → activity purpose assignment (with concurrent LLM calls) → graph construction → graph retrieval → multi-agent reflection → execution of three prediction tasks.

Environment Setup

Before starting, ensure your environment is properly configured:

  • Python Version: >= 3.10 (virtual environment recommended; see the check after this list)
  • Basic Dependencies: pip install -e .
  • Development: pip install -e ".[dev]" (quoted so the brackets survive shells such as zsh)
  • MCP Extension: pip install langchain-mcp-adapters mcp anyio httpx-sse
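
A quick way to confirm the interpreter meets the version floor before installing anything:

import sys

# MRRA requires Python 3.10 or newer
assert sys.version_info >= (3, 10), f"Python >= 3.10 required, found {sys.version_info}"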

Key Package Explanations

  • langchain-openai: Creates OpenAI-compatible chat model clients (see the sketch after this list)
  • networkx: Graph structure implementation
  • pandas: Data processing and time parsing
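
For reference, make_llm in the example below presumably wraps a client of this kind; if you want to construct one directly, langchain-openai exposes ChatOpenAI. The model, endpoint, and key below mirror the placeholder configuration used later:

from langchain_openai import ChatOpenAI

# Any OpenAI-compatible endpoint works; these values are placeholders
llm = ChatOpenAI(
    model="qwen-plus",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="YOUR_API_KEY",
    temperature=0.2,
)
print(llm.invoke("ping").content)  # round-trip sanity check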

Complete Example

Here's a complete working example that demonstrates the full MRRA workflow:

from mrra.data.trajectory import TrajectoryBatch
from mrra.data.activity import ActivityExtractor
from mrra.analysis.activity_purpose import ActivityPurposeAssigner
from mrra.graph.mobility_graph import MobilityGraph, GraphConfig
from mrra.retriever.graph_rag import GraphRAGGenerate
from mrra.graph.pattern import PatternGenerate
from mrra.agents.builder import build_mrra_agent
from mrra.agents.subagents import make_llm
from mrra.persist.cache import CacheManager, compute_tb_hash

import pandas as pd

# 1) Load trajectory data (must contain columns: user_id, timestamp, latitude, longitude)
df = pd.read_csv("your_traj.csv")
tb = TrajectoryBatch(df)  # Auto-parse time, add derived columns like hour/dow

# 2) Extract activities (radius or grid method), then refine "activity purposes" with concurrent LLM calls
ext_cfg = dict(
    method="radius", 
    radius_m=300, 
    min_dwell_minutes=30, 
    max_gap_minutes=90, 
    grid_size_m=200
)
acts = ActivityExtractor(tb, **ext_cfg).extract()

llm_cfg = dict(
    provider='openai-compatible', 
    model='qwen-plus', 
    base_url='https://dashscope.aliyuncs.com/compatible-mode/v1', 
    api_key='YOUR_API_KEY', 
    temperature=0.2
)
llm = make_llm(**llm_cfg)  # Note: This is a callable client, not a dictionary
acts = ActivityPurposeAssigner(tb, llm=llm, concurrency=8).assign(acts)

# 3) Optional: Cache results for reuse
cm = CacheManager()
tb_hash = compute_tb_hash(tb)
cm.save_activities(tb_hash, "default", acts)

# 4) Build Mobility Graph (inject "purpose-assigned" activities to avoid re-calling LLM during graph construction)
cfg = GraphConfig(grid_size_m=200, min_dwell_minutes=5, use_activities=True)
mg = MobilityGraph(tb, cfg, activities=acts, assume_purposes_assigned=True)

# 5) Graph retriever (GraphRAG), supports purpose as seed
retriever = GraphRAGGenerate(tb=tb, mobility_graph=mg)

# 6) Multi-agent reflection agent (two sub-agents + confidence voting aggregation)
reflection_cfg = dict(
    max_round=1,
    subAgents=[
        {"name": "temporal", "prompt": "Select the most likely location id from Options (selection), do not output coordinates."},
        {"name": "spatial",  "prompt": "Select the most likely location id from Options (selection), do not output coordinates."},
    ],
    aggregator="confidence_weighted_voting",
)
agent = build_mrra_agent(llm=llm_cfg, retriever=retriever, reflection=reflection_cfg)

# 7) Execute three prediction tasks
user_id = tb.users()[0]
last_ts = tb.for_user(user_id).iloc[-1]["timestamp_local"].strftime("%Y-%m-%d %H:%M:%S")

for task, extra in (
    ("next_position", {"t": last_ts}),
    ("future_position", {"t": last_ts}),
    ("full_day_traj", {"date": tb.for_user(user_id).iloc[-1]["timestamp_local"].strftime("%Y-%m-%d")}),
):
    payload = {"task": task, "user_id": user_id}
    payload.update(extra)
    res = agent.invoke(payload)
    print(task, res)

Script Demonstration

The script scripts/verify_geolife.py demonstrates the complete workflow above, with these additional features:

  • Concurrent LLM inference of "activity purposes", with results cached under .mrra_cache/ (see the sketch after this list)
  • Reuse of the purpose-assigned activities during graph construction (avoiding a second round of LLM calls)
  • Use of a purpose as the retrieval seed (purpose=dining)
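
On a repeat run you can reuse the cached purposes instead of calling the LLM again. A minimal sketch, assuming CacheManager exposes a load_activities counterpart to the save_activities call above that returns None on a cache miss (verify against the actual API):

from mrra.data.activity import ActivityExtractor
from mrra.analysis.activity_purpose import ActivityPurposeAssigner
from mrra.persist.cache import CacheManager, compute_tb_hash

cm = CacheManager()
tb_hash = compute_tb_hash(tb)  # tb, llm, ext_cfg as in the complete example

acts = cm.load_activities(tb_hash, "default")  # hypothetical read API
if acts is None:
    # Cache miss: extract and assign purposes as in the complete example
    acts = ActivityExtractor(tb, **ext_cfg).extract()
    acts = ActivityPurposeAssigner(tb, llm=llm, concurrency=8).assign(acts)
    cm.save_activities(tb_hash, "default", acts)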

Data Requirements

Your trajectory data must include these four columns (the same requirement noted in the TrajectoryBatch step above):

user_id      - User identifier
timestamp    - Timestamp (various formats supported)
latitude     - Latitude coordinate
longitude    - Longitude coordinate

The following columns are optional and are used when present:

poi_id        - Point of interest ID
activity_type - Activity type annotation
speed         - Movement speed
bearing       - Direction of movement

A minimal example CSV:

user_id,timestamp,latitude,longitude,poi_id
user_1,2023-01-01 09:00:00,31.2304,121.4737,poi_home
user_1,2023-01-01 12:00:00,31.2404,121.4837,poi_office
user_1,2023-01-01 18:30:00,31.2304,121.4737,poi_home
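
If you prefer to smoke-test without a CSV on disk, the same sample can be built in memory and handed to TrajectoryBatch directly:

import pandas as pd

from mrra.data.trajectory import TrajectoryBatch

df = pd.DataFrame({
    "user_id": ["user_1"] * 3,
    "timestamp": ["2023-01-01 09:00:00", "2023-01-01 12:00:00", "2023-01-01 18:30:00"],
    "latitude": [31.2304, 31.2404, 31.2304],
    "longitude": [121.4737, 121.4837, 121.4737],
    "poi_id": ["poi_home", "poi_office", "poi_home"],  # optional column
})
tb = TrajectoryBatch(df)  # parses timestamps and adds derived columns such as hour/dow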

For Geolife dataset users: place the original .plt files in scripts/Data/<user>/Trajectory/ and run python scripts/verify_geolife.py. You can select the user directory with the environment variable GEOLIFE_USER=<user>.
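
The same run can be scripted from Python; "000" below is only an illustrative Geolife user folder name:

import os
import subprocess

env = {**os.environ, "GEOLIFE_USER": "000"}  # "000" is an illustrative user folder
subprocess.run(["python", "scripts/verify_geolife.py"], env=env, check=True)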

Next Steps

Once you have completed the quick start: