
Activity Patterns

Pattern summarization methods and their use in user profiling


PatternGenerate uses lightweight heuristics to quickly derive long-term and short-term activity patterns plus a basic user profile from trajectory data. These summaries provide essential context for mobility prediction and for understanding user behavior.

Pattern Types

Profile Patterns

  • profile.home: Approximate home location, inferred from nighttime hotspots
  • profile.work: Approximate work location, inferred from weekday daytime hotspots

Temporal Patterns

  • short: A summary of transitions between consecutive locations over the most recent 10 observations
  • long: A one- or two-sentence overview of hourly and weekly location preferences

Basic Usage

Here's how to generate and cache activity patterns:

from mrra.graph.pattern import PatternGenerate
from mrra.persist.cache import CacheManager, compute_tb_hash

# Initialize the pattern generator (`tb` is the trajectory base built during data loading)
pat = PatternGenerate(tb)

# Generate patterns for a specific user
user_id = tb.users()[0]
patterns = pat.long_short_patterns(user_id)

# Cache the results
cm = CacheManager()
tb_hash = compute_tb_hash(tb)
cm.save_json(tb_hash, f"patterns_{user_id}", patterns, kind="patterns")

print("Generated patterns:", patterns)

Pattern Structure

The generated patterns follow this structure (field values shown are illustrative):

{
    "user_id": "user_123",
    "profile": {
        "home": {
            "location": "g_1234_5678",  # Grid location
            "confidence": 0.85,
            "coordinates": [31.2304, 121.4737],
            "visit_count": 45,
            "total_minutes": 2880
        },
        "work": {
            "location": "g_2345_6789",
            "confidence": 0.92,
            "coordinates": [31.2404, 121.4837],
            "visit_count": 22,
            "total_minutes": 1760
        }
    },
    "short": "Recent movement: home→office→restaurant→office→home. Typical commuting pattern with lunch break.",
    "long": "User shows regular 9-5 work schedule with consistent lunch breaks around 12-1pm. Weekend activity concentrated around residential area.",
    "generation_time": "2024-01-01T12:00:00Z",
    "confidence": 0.78
}
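
Individual fields can be read straight off the returned dictionary. A minimal sketch against the structure above:

# Pull the inferred home cell and its confidence, with safe fallbacks
home = patterns.get('profile', {}).get('home', {})
print("Home cell:", home.get('location', 'unknown'))
print("Home confidence:", home.get('confidence', 0.0))
print("Recent behavior:", patterns.get('short', 'n/a'))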

Advanced Pattern Generation

Custom Pattern Extraction

You can extend pattern generation with custom logic:

class ExtendedPatternGenerate(PatternGenerate):
    def __init__(self, tb, **kwargs):
        super().__init__(tb, **kwargs)
    
    def generate_weekend_patterns(self, user_id):
        """Generate weekend-specific patterns"""
        user_data = self.tb.for_user(user_id)
        weekend_data = user_data[user_data['dow'].isin([5, 6])]  # Sat, Sun
        
        if len(weekend_data) < 5:
            return {"weekend": "Insufficient weekend data"}
        
        # Analyze weekend hotspots
        weekend_locations = weekend_data.groupby(['grid_y', 'grid_x']).agg({
            'timestamp_local': 'count',
            'latitude': 'mean',
            'longitude': 'mean'
        }).reset_index()
        
        top_weekend_spot = weekend_locations.loc[weekend_locations['timestamp_local'].idxmax()]
        
        return {
            "weekend": {
                "primary_location": f"g_{top_weekend_spot['grid_y']}_{top_weekend_spot['grid_x']}",
                "coordinates": [top_weekend_spot['latitude'], top_weekend_spot['longitude']],
                "visit_frequency": top_weekend_spot['timestamp_local'],
                "pattern": "Weekend activity concentrated in residential/leisure areas"
            }
        }
    
    def generate_hourly_preferences(self, user_id):
        """Generate detailed hourly activity preferences"""
        user_data = self.tb.for_user(user_id)
        
        hourly_stats = user_data.groupby('hour').agg({
            'latitude': ['mean', 'std'],
            'longitude': ['mean', 'std'],
            'timestamp_local': 'count'
        }).round(4)
        
        # Identify peak activity hours
        activity_counts = user_data['hour'].value_counts().sort_index()
        peak_hours = activity_counts.nlargest(3).index.tolist()
        
        return {
            "hourly_preferences": {
                "peak_hours": peak_hours,
                "activity_distribution": activity_counts.to_dict(),
                "mobility_variance": {
                    f"hour_{h}": {
                        "lat_std": hourly_stats.loc[h, ('latitude', 'std')],
                        "lon_std": hourly_stats.loc[h, ('longitude', 'std')]
                    } for h in peak_hours
                }
            }
        }
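
Usage mirrors the base class; the extra methods return plain dictionaries that can be merged into the standard pattern output:

# Combine base patterns with the custom weekend and hourly summaries
ext = ExtendedPatternGenerate(tb)
patterns = ext.long_short_patterns(user_id)
patterns.update(ext.generate_weekend_patterns(user_id))
patterns.update(ext.generate_hourly_preferences(user_id))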

Pattern-Based User Clustering

Use patterns for user segmentation and clustering:

import pandas as pd
from sklearn.cluster import KMeans
import numpy as np

def cluster_users_by_patterns(pattern_dict_list):
    """Cluster users based on their mobility patterns"""
    
    features = []
    user_ids = []
    
    for patterns in pattern_dict_list:
        if 'profile' not in patterns:
            continue
            
        user_ids.append(patterns['user_id'])
        
        # Extract numerical features
        feature_vector = [
            patterns['profile'].get('home', {}).get('confidence', 0),
            patterns['profile'].get('work', {}).get('confidence', 0),
            patterns['profile'].get('home', {}).get('visit_count', 0),
            patterns['profile'].get('work', {}).get('visit_count', 0),
            patterns.get('confidence', 0),
            len(patterns.get('short', '')),  # Pattern complexity
        ]
        features.append(feature_vector)
    
    # Perform clustering (guard against having fewer users than clusters)
    if len(features) < 3:
        raise ValueError("Need at least 3 users with profile patterns to form 3 clusters")

    features_array = np.array(features)
    kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
    clusters = kmeans.fit_predict(features_array)
    
    # Create cluster assignments
    cluster_assignments = {}
    for i, user_id in enumerate(user_ids):
        cluster_assignments[user_id] = {
            'cluster': int(clusters[i]),
            'features': features[i]
        }
    
    return cluster_assignments, kmeans
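
For example, generating patterns for every user and clustering them (this assumes the dataset contains at least three users with profile patterns):

# Build one pattern dict per user, then cluster
all_patterns = [pat.long_short_patterns(uid) for uid in tb.users()]
assignments, model = cluster_users_by_patterns(all_patterns)

for uid, info in assignments.items():
    print(uid, "-> cluster", info['cluster'])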

Integration with Agent Systems

Activity patterns are particularly useful as initial "background knowledge" for agents:

Agent Context Integration

def build_agent_with_patterns(llm_cfg, retriever, patterns, reflection_cfg):
    """Build agent with pattern-based context"""
    
    # Enhance system prompt with pattern information
    pattern_context = f"""
    User Profile Context:
    - Home location confidence: {patterns.get('profile', {}).get('home', {}).get('confidence', 0):.2f}
    - Work location confidence: {patterns.get('profile', {}).get('work', {}).get('confidence', 0):.2f}
    - Recent pattern: {patterns.get('short', 'No recent pattern available')}
    - Long-term behavior: {patterns.get('long', 'No long-term pattern available')}
    
    Use this context to inform your predictions and reasoning.
    """
    
    # Add context to sub-agent prompts
    enhanced_subagents = []
    for subagent in reflection_cfg['subAgents']:
        enhanced_prompt = f"{pattern_context}\n\n{subagent['prompt']}"
        enhanced_subagents.append({
            **subagent,
            'prompt': enhanced_prompt
        })
    
    enhanced_reflection_cfg = {
        **reflection_cfg,
        'subAgents': enhanced_subagents
    }
    
    return build_mrra_agent(
        llm=llm_cfg, 
        retriever=retriever, 
        reflection=enhanced_reflection_cfg
    )
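
A hypothetical invocation, assuming llm_cfg, retriever, and reflection_cfg are configured as described in the agent setup docs:

# Generate patterns first, then bake them into the agent's prompts
patterns = pat.long_short_patterns(user_id)
agent = build_agent_with_patterns(llm_cfg, retriever, patterns, reflection_cfg)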

Dynamic Pattern Updates

For long-running systems, implement dynamic pattern updates:

class DynamicPatternManager:
    def __init__(self, tb, update_interval_hours=24):
        self.tb = tb
        self.pattern_generator = PatternGenerate(tb)
        self.update_interval = update_interval_hours
        self.last_update = {}
        self.cached_patterns = {}
    
    def get_patterns(self, user_id, force_update=False):
        """Get patterns with automatic updates"""
        from datetime import datetime, timedelta
        
        now = datetime.now()
        
        # Check if update needed
        if (user_id not in self.last_update or 
            now - self.last_update[user_id] > timedelta(hours=self.update_interval) or
            force_update):
            
            # Generate fresh patterns
            patterns = self.pattern_generator.long_short_patterns(user_id)
            self.cached_patterns[user_id] = patterns
            self.last_update[user_id] = now
            
            # Cache to disk
            cm = CacheManager()
            tb_hash = compute_tb_hash(self.tb)
            cm.save_json(tb_hash, f"patterns_{user_id}", patterns, kind="patterns")
        
        return self.cached_patterns.get(user_id, {})
    
    def batch_update_patterns(self, user_ids):
        """Update patterns for multiple users efficiently"""
        updated_patterns = {}
        
        for user_id in user_ids:
            try:
                patterns = self.get_patterns(user_id, force_update=True)
                updated_patterns[user_id] = patterns
            except Exception as e:
                print(f"Failed to update patterns for {user_id}: {e}")
                updated_patterns[user_id] = None
        
        return updated_patterns
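
Typical usage in a long-running service:

# Refresh patterns at most once per update interval
manager = DynamicPatternManager(tb, update_interval_hours=12)
patterns = manager.get_patterns(user_id)               # cached or freshly generated
refreshed = manager.batch_update_patterns(tb.users())  # force-refresh all users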

Pattern Validation and Quality Assessment

Pattern Quality Metrics

def assess_pattern_quality(patterns):
    """Assess the quality and reliability of generated patterns"""
    
    quality_score = 0.0
    quality_factors = {}
    
    # Home/work confidence
    home_conf = patterns.get('profile', {}).get('home', {}).get('confidence', 0)
    work_conf = patterns.get('profile', {}).get('work', {}).get('confidence', 0)
    
    quality_factors['location_confidence'] = (home_conf + work_conf) / 2
    quality_score += quality_factors['location_confidence'] * 0.4
    
    # Pattern richness (length of descriptions)
    short_richness = len(patterns.get('short', '')) / 100  # Normalize
    long_richness = len(patterns.get('long', '')) / 150   # Normalize
    
    quality_factors['pattern_richness'] = min(1.0, (short_richness + long_richness) / 2)
    quality_score += quality_factors['pattern_richness'] * 0.3
    
    # Overall confidence
    overall_conf = patterns.get('confidence', 0)
    quality_factors['overall_confidence'] = overall_conf
    quality_score += overall_conf * 0.3
    
    return {
        'quality_score': min(1.0, quality_score),
        'factors': quality_factors,
        'reliability': 'high' if quality_score > 0.7 else 'medium' if quality_score > 0.4 else 'low'
    }
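
For example, gating downstream use on the reliability label:

# Only feed sufficiently reliable profiles into the agent context
quality = assess_pattern_quality(patterns)
if quality['reliability'] != 'low':
    print(f"Usable profile (score {quality['quality_score']:.2f})")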

Use Cases

Typical Applications:

  • Agent Background Knowledge: Help LLMs form reasonable priors about user behavior
  • Evaluation Material: Use for explaining and visualizing prediction results
  • User Profiling: Understand individual mobility characteristics
  • Anomaly Detection: Identify deviations from established patterns (see the sketch below)
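
For the anomaly-detection case, here is a minimal sketch that flags observations far from every profiled hotspot. It assumes the profile structure shown earlier; the haversine helper and the 10 km threshold are illustrative choices, not part of MRRA:

import math

def is_anomalous(patterns, lat, lon, threshold_km=10.0):
    """Flag an observation that is far from both home and work hotspots."""
    def haversine_km(lat1, lon1, lat2, lon2):
        # Great-circle distance in kilometers
        r = 6371.0
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    anchors = [p['coordinates'] for p in patterns.get('profile', {}).values()
               if 'coordinates' in p]
    if not anchors:
        return False  # no baseline to compare against
    return all(haversine_km(lat, lon, a[0], a[1]) > threshold_km for a in anchors)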

Personalized Recommendation

def generate_personalized_recommendations(patterns, current_context):
    """Generate location recommendations based on patterns"""
    
    recommendations = []
    
    # Time-based recommendations
    current_hour = current_context.get('hour', 12)
    
    if 7 <= current_hour <= 9 and patterns.get('profile', {}).get('work'):
        recommendations.append({
            'type': 'work_commute',
            'location': patterns['profile']['work']['location'],
            'reason': 'Typical work commute time based on your patterns',
            'confidence': patterns['profile']['work']['confidence']
        })
    
    if 11 <= current_hour <= 14:
        # Lunch time recommendations based on short patterns
        short_pattern = patterns.get('short', '')
        if 'restaurant' in short_pattern.lower() or 'dining' in short_pattern.lower():
            recommendations.append({
                'type': 'dining',
                'reason': 'Based on recent dining patterns',
                'confidence': 0.6
            })
    
    if current_hour >= 17 and patterns.get('profile', {}).get('home'):
        recommendations.append({
            'type': 'home_return',
            'location': patterns['profile']['home']['location'],
            'reason': 'Typical home return time',
            'confidence': patterns['profile']['home']['confidence']
        })
    
    return recommendations
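
For example, with a simple context dict (the function only reads the hour key):

# Evening context: expect a home-return suggestion if a home profile exists
recs = generate_personalized_recommendations(patterns, {'hour': 18})
for rec in recs:
    print(rec['type'], '-', rec['reason'], f"(confidence {rec['confidence']:.2f})")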

Performance and Optimization

Performance Notes:

  • Pattern generation is computationally light compared to LLM calls
  • Cache patterns to avoid regeneration for static datasets
  • Update patterns periodically for dynamic/streaming data
  • Consider pattern freshness for real-time applications

Next Steps