Deployment Change Log 74c4bbe1e

Showing commits from the last 14 days included in this build (automated version bump commits are excluded).

Commit  Author  Date  Message
74c4bbe1e Tim Richardson 2026-02-03T16:04:12+11:00 fix(three_pl_dermapen): Validate email format and clarify preflight errors
Replace the bare constr(min_length=1) email field on the Hexspoor
Address model with Pydantic's EmailStr so malformed addresses are
caught before the order is submitted to the 3PL.

Reword the ValidationError and AddressValidationError log/email
messages to explicitly state the order was NOT sent and to instruct
the user to correct the address in Cin7. The email subject now
includes "(not sent)" and the full preflight message is persisted
to the fulfilment row's result_message for easier auditing.
5a0494c4b Tim Richardson 2026-02-01T09:02:58+11:00 chore(eks): Add EKS 1.33→1.34 upgrade plan and fix Grafana strategy
Add Serena memory documenting the full EKS upgrade sequence from
Kubernetes 1.33 to 1.34, including critical warnings about VPC CNI
prefix delegation reset and Karpenter upgrade ordering.

Set Grafana deploymentStrategy to Recreate in kube-prometheus-stack
values. With a ReadWriteOnce PVC, a rolling update starts the new pod
while the old pod still holds the volume, so the new pod hangs waiting
for volume detach.
21856b356 Tim Richardson 2026-02-01T08:00:54+11:00 fix(core): Fix lock metadata key mismatch for singleflight locks (timeout=0)
Lock metadata was stored without the hostname prefix when timeout=0
(singleflight/non-blocking mode) because the metadata_key recalculation
was gated on `if lock_metadata and timeout`. Since 0 is falsy in Python,
singleflight locks never had their metadata_key updated after the
hostname prefix was applied to lock_key.

This caused the lock list diagnostics page to be unable to find metadata
(and thus display job links) for any singleflight-coordinated lock,
including analytics updates, cache refresh operations, etc.

The metadata was stored as e.g. "ANALYTICS-UPDATE-CUSTOMERS-..._metadata"
but retrieved as "1300tempfence-ANALYTICS-UPDATE-CUSTOMERS-..._metadata".

Fix: unconditionally recalculate metadata_key after hostname prefixing.
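
The falsy-zero gate described above can be reproduced in a few lines. This is a minimal sketch, not the project's lock code: `build_lock_keys` and its `fixed` switch are hypothetical names used only to contrast the old and new behaviour.

```python
def build_lock_keys(lock_key, hostname, lock_metadata, timeout, fixed=True):
    """Return (lock_key, metadata_key) after hostname prefixing (illustrative)."""
    metadata_key = f"{lock_key}_metadata"  # computed before the prefix is applied
    lock_key = f"{hostname}-{lock_key}"    # hostname prefix applied to the lock key
    if fixed:
        # Fixed behaviour: always recompute so both keys share the prefix.
        metadata_key = f"{lock_key}_metadata"
    elif lock_metadata and timeout:
        # Old behaviour: timeout=0 is falsy, so singleflight locks skip this.
        metadata_key = f"{lock_key}_metadata"
    return lock_key, metadata_key


# timeout=0 (singleflight): the old gate leaves the metadata key unprefixed.
print(build_lock_keys("ANALYTICS-UPDATE", "host", {"job": 1}, 0, fixed=False))
print(build_lock_keys("ANALYTICS-UPDATE", "host", {"job": 1}, 0))
```

Comparing `timeout is not None` would also avoid the falsy-zero trap, but the unconditional recalculation matches the fix described in this commit.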
6d2890836 Tim Richardson 2026-02-01T07:15:36+11:00 fix(valkey): Upgrade Valkey from 8.1.3 to 8.1.5 (security patch)
Addresses 4 CVEs patched in 8.1.4+:
- CVE-2025-49844: Lua script remote code execution
- CVE-2025-46817: Integer overflow and potential RCE via Lua
- CVE-2025-46818: Lua script execution in another user's context
- CVE-2025-46819: Lua out-of-bounds read

Patch-level semver bump — no config or client changes required.
5fa276692 Tim Richardson 2026-02-01T06:57:11+11:00 fix(eks): Increase Karpenter EC2NodeClass maxPods from 80 to 110
During the K8s 1.34 upgrade, the VPC CNI addon reset ENABLE_PREFIX_DELEGATION
to false. With maxPods: 80 exceeding the ~58 secondary IPs available without
prefix delegation on r6a.xlarge, 16 pods got stuck in ContainerCreating for 8+
hours due to IP exhaustion. Setting maxPods to 110 provides a safer ceiling
that remains well above the non-prefix-delegation limit while Karpenter nodes
are typically CPU/memory-constrained first.
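
The ~58 figure is consistent with the commonly used ENI-based pod limit formula. The sketch below assumes an r6a.xlarge supports 4 network interfaces with 15 IPv4 addresses each; those two numbers are assumptions, not taken from the commit, and `eni_max_pods` is a hypothetical helper.

```python
def eni_max_pods(eni_count: int, ipv4_per_eni: int) -> int:
    """Pod ceiling when each pod takes one secondary IP (no prefix delegation)."""
    # One address per ENI is its primary IP; the +2 follows the commonly
    # published formula and covers pods that use host networking.
    return eni_count * (ipv4_per_eni - 1) + 2


# Assumed r6a.xlarge limits (not stated in the commit): 4 ENIs, 15 IPv4 each.
ip_limited_pods = eni_max_pods(eni_count=4, ipv4_per_eni=15)
old_max_pods = 80

# maxPods above the IP-limited ceiling lets the scheduler place pods that
# can never receive an address while prefix delegation is off.
print(ip_limited_pods, old_max_pods > ip_limited_pods)
```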
59420c228 Tim Richardson 2026-01-31T13:43:31+11:00 chore(eks): Upgrade EKS cluster from Kubernetes 1.33 to 1.34
Upgrade Karpenter from 1.6.0 to 1.8.6:
- Add StaticCapacity feature gate (required to prevent panic in 1.8)
- Add iam:ListInstanceProfiles permission (required since Karpenter 1.7)
- Fix eks:DescribeCluster resource ARN (was incorrectly pointing to IAM
  role instead of EKS cluster, causing AccessDeniedException in 1.8)

Update EC2NodeClass AMI alias from al2023@v20250915 (1.33) to
al2023@latest (1.34) so new Karpenter-provisioned nodes use 1.34 kubelet.

Remove legacy cluster autoscaler files (not deployed, replaced by
Karpenter): deployment manifest, IAM policy, and setup script.
Corresponding IAM role and policy also deleted from AWS.

Control plane upgraded to v1.34.3-eks. All managed addons updated:
kube-proxy v1.34.0, CoreDNS v1.13.1, VPC CNI v1.21.1,
EBS CSI v1.55.0, EFS CSI v2.3.0, Pod Identity Agent v1.3.10,
Node Monitoring Agent v1.5.0.

Managed node group remains on 1.33 kubelet (supported n-1 skew).
6af0b4bfd Tim Richardson 2026-01-30T14:59:49+11:00 fix(cached_dear): Extend customer cache reload window to 9999 days
Increase the full customer cache reload lookback from 3650 days
(~10 years) to 9999 days (~27 years) to ensure renamed reference
objects are detected across a wider historical window.
e10e743af Tim Richardson 2026-01-30T14:43:54+11:00 fix(cached_dear): Handle frozenset update_fields in DearCache save
Django's update_or_create may pass update_fields as a frozenset,
which is immutable and breaks the existing list.append() call.
Replace with set union to safely inject dirty_flags regardless of
the collection type.
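
A minimal sketch of the set-union approach, assuming the injected field is named `dirty_flags` as in the message above; `with_dirty_flags` is a hypothetical helper, not the actual DearCache.save() code.

```python
def with_dirty_flags(update_fields):
    """Return update_fields with 'dirty_flags' added, for any iterable type."""
    if update_fields is None:
        # In Django, update_fields=None means "save every field".
        return None
    # Building a new set works for list, tuple, set and frozenset alike,
    # whereas frozenset has no append() or add().
    return set(update_fields) | {"dirty_flags"}


print(with_dirty_flags(frozenset({"jdata"})))
print(with_dirty_flags(["jdata", "object_type"]))
```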

Also update query-job-logs skill to use the correct JobMaster
field name (job_title) and remove reference to nonexistent
modified field.
ac371baaa Tim Richardson 2026-01-30T14:24:54+11:00 feat(cached_dear): Detect renamed Dear reference objects via dirty flag on customer save
Dear Inventory does not update LastModifiedOn when Price Tiers or Locations
are renamed in Settings, causing incremental cache syncs to miss changes.

Add a daily full customer cache reload (3650 days back) that re-fetches all
customers. On save, DearCache now compares PriceTier and Location jdata fields
against existing DB values. If changed, both ZOHO_ANALYTICS and
POSTGRES_ANALYTICS dirty flag bits are set, ensuring downstream analytics
consumers re-process the affected records.

Also adds DirtyFlag.ALL_CONSUMERS composite flag for convenience.
f8cfb5f76 Tim Richardson 2026-01-30T12:34:39+11:00 feat(cached_dear): Improve pricing API test fixtures and coverage
Add valid_price_tier_names fixture that fetches configured tiers from
the Dear API, ensuring account_customer fixture only picks customers
whose PriceTier resolves to a valid tier number (prevents false 422
failures from stale cache data).

Add DearCachedAPI import, non_account_customer and branch_customer
fixtures. New tests: price positivity, verbose mode on/off, cache hit
customer info preservation, non-account customer fallback behaviour.
Fix pre-existing assertions in test_invalid_product and docs template
name check.
e5dda5c74 Tim Richardson 2026-01-30T12:34:29+11:00 fix(cached_dear): Render JSON widget for view-only admin users
DearCacheAdmin view-only staff users (without change permission) saw
raw JSON instead of the ResizableJSONEditorWidget because Django's
read-only view bypasses formfield_overrides entirely.

Override has_change_permission to return True on GET for users with
view permission, forcing Django to render the full change form with
widgets. POST requests still enforce real permissions. Hide save/delete
buttons via change_view extra_context for view-only users.
e59401120 Tim Richardson 2026-01-30T11:52:07+11:00 chore(serena): Reformat project config with comments
Expand .serena/project.yml to use multi-line YAML list format for
included_optional_tools and add documented placeholder fields for
base_modes, default_modes, and fixed_tools configuration options.
No functional changes; improves readability and documents available
configuration options.
739c5199a Tim Richardson 2026-01-30T11:51:38+11:00 feat(cached_dear): Add customer resolution info to SiteHQ pricing API with region fallback
When a requested customer is not found in the cache, the pricing API now
attempts to refresh from the Dear API via get_cached_customer_row_by_id()
before falling back to the region's Branch customer. Errors are only returned
when no valid fallback exists (no region provided, or branch customer missing).

Every successful response now includes a `customer` object with requested_id,
resolved_id, resolved_name, and source fields, allowing API consumers to detect
when a fallback customer was used instead of the requested one.

Changes:
- Refactored get_cached_customer_row_by_id() to use object_uniqueID (primary key)
  instead of jdata__ID (JSONB lookup) for faster cache queries
- Moved customer existence check from serializer to _get_customer() business logic,
  enabling the region fallback path for non-existent customers
- Customer resolution info is constructed per-request (not cached) since
  requested_id varies between callers sharing the same branch customer cache
- Updated pricing_api_docs.html with Customer Resolution section, response field
  table, and fallback response example
- Added tests: nonexistent customer with valid/invalid/no region, customer info
  in success responses, and fixed pre-existing test assertions
7c4017a49 Tim Richardson 2026-01-29T14:44:43+11:00 feat(cached_dear): Add product-level discount matching with priority strategy
Extend the pricing system to match products against both category-level
and SKU-specific discount rules within Dear Deals. Previously, discounts
only matched by product category. Now the system checks both
DealDiscountCategories and DealDiscountProducts for matches.

When a product matches multiple rules (e.g., both a category rule AND a
SKU-specific rule), a new ProductDiscountPriority enum determines which
wins:
- OBEY_DEAR_SEQUENCE (default): Follows Dear's configured Sequence order
- LOWEST_PRICE: Picks the highest discount percentage

The new find_discount_in_deal_for_product() method replaces the
category-only lookup and is used by both Pricing and PricingSiteHQ
classes. Verbose API responses now include product_match_type to
indicate whether the discount matched via 'product' (SKU) or 'category'.

Adds comprehensive test coverage for all matching scenarios including
edge cases like missing Sequence fields and ProductID-based matching.
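
The two priority modes can be illustrated as follows. This is a sketch under stated assumptions: the enum member names come from the message above, but `pick_rule`, the dict keys `discount` and `match_type`, and the choice to sort rules without a Sequence last are all hypothetical.

```python
from enum import Enum


class ProductDiscountPriority(Enum):
    OBEY_DEAR_SEQUENCE = "obey_dear_sequence"
    LOWEST_PRICE = "lowest_price"


def pick_rule(matches, priority=ProductDiscountPriority.OBEY_DEAR_SEQUENCE):
    """Choose one rule from the category/SKU rules that matched a product."""
    if not matches:
        return None
    if priority is ProductDiscountPriority.LOWEST_PRICE:
        # Highest discount percentage gives the lowest price.
        return max(matches, key=lambda rule: rule["discount"])
    # Assumption: rules without a Sequence value sort after those with one.
    return min(matches, key=lambda rule: rule.get("Sequence", float("inf")))


matches = [
    {"match_type": "category", "discount": 10, "Sequence": 1},
    {"match_type": "product", "discount": 25, "Sequence": 2},
]
print(pick_rule(matches)["match_type"])
print(pick_rule(matches, ProductDiscountPriority.LOWEST_PRICE)["match_type"])
```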
979b09360 Tim Richardson 2026-01-29T13:22:08+11:00 fix(cached_dear): Prevent $0 price leakage in SiteHQ pricing API error responses
The pricing API had a critical flaw where errors could return BOTH an error
message AND a well-formed price response with $0 prices. Downstream systems
could interpret these $0 prices as valid, leading to incorrect pricing.

Changes:
- Add custom exception hierarchy (CustomerNotFoundError, RegionFallbackError,
  PriceTierNotFoundError) in new pricing_exceptions.py module
- Refactor _get_customer() to raise domain exceptions instead of
  serializers.ValidationError
- Refactor get_customer_price() to raise PriceTierNotFoundError instead of
  returning error dict with partial data
- Add explicit exception handling in PricingSiteHQAPIView with proper HTTP
  status codes (400 for client errors, 422 for config issues, 500 for server errors)
- Update API docs with new error codes and response format
- Add tests verifying error responses contain error_code and NO product/price data

Breaking changes:
- Error responses now return proper HTTP status codes (previously all returned 200)
- Error response structure includes new error_code field for programmatic handling
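
One way to picture the new error path is an exception hierarchy that carries its own status and error code. The three exception names are from the list above; the `PricingError` base class, the `error_code` strings, and the assignment of 400 versus 422 to each exception are assumptions made for illustration.

```python
class PricingError(Exception):
    """Hypothetical base class for domain errors."""
    status_code = 500
    error_code = "pricing_error"


class CustomerNotFoundError(PricingError):
    status_code = 400  # assumed: treated as a client error
    error_code = "customer_not_found"


class RegionFallbackError(PricingError):
    status_code = 400  # assumed: treated as a client error
    error_code = "region_fallback_failed"


class PriceTierNotFoundError(PricingError):
    status_code = 422  # assumed: treated as a configuration issue
    error_code = "price_tier_not_found"


def error_response(exc):
    """Build (status, body) with an error_code and no product or price data."""
    if isinstance(exc, PricingError):
        return exc.status_code, {"error": str(exc), "error_code": exc.error_code}
    return 500, {"error": "internal error", "error_code": "server_error"}


print(error_response(PriceTierNotFoundError("tier 'Gold' is not configured")))
```

Because the body is built only from the exception, there is no code path on which a partially populated price payload can accompany an error.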
a1d667ed3 Tim Richardson 2026-01-29T12:52:35+11:00 fix(revel_pos): Prevent invoice overpayment when applying multiple payments
Add overpayment protection to pay_invoices function:
- Track remaining invoice balance across payment applications
- Cap each payment amount to remaining balance with 0.005 tolerance
- Skip payments when invoice is fully paid
- Re-enable zero/negative payment amount validation
- Improve logging to show both capped and original amounts
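
The capping logic listed above might look roughly like this. It is a sketch, not the actual pay_invoices code: `cap_payments` is a hypothetical name, and using the 0.005 tolerance to decide when an invoice counts as fully paid is an assumption about how the tolerance is applied.

```python
TOLERANCE = 0.005  # balances at or below this are treated as fully paid


def cap_payments(invoice_balance, payment_amounts):
    """Return the amounts to apply so the invoice is never overpaid."""
    remaining = invoice_balance
    applied = []
    for amount in payment_amounts:
        if amount <= 0:
            continue  # zero/negative amounts are rejected
        if remaining <= TOLERANCE:
            continue  # invoice already fully paid; skip the payment
        capped = min(amount, remaining)
        applied.append(round(capped, 2))
        remaining -= capped
    return applied


# Two 60.00 payments against a 100.00 invoice: the second is capped at 40.00
# and the third is skipped.
print(cap_payments(100.0, [60.0, 60.0, 10.0]))
```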
ee94e610f Tim Richardson 2026-01-29T12:34:47+11:00 fix(shopify): Add tenacity retry to Loop API for ChunkedEncodingError (DAS-400)
Loop API requests now retry on transient connection errors that occur during
JSON parsing (ChunkedEncodingError, ConnectionError, ReadTimeout). The fix wraps
both HTTP request and JSON parsing in a single retryable unit since the error
occurs in .json(), not .get(), making urllib3's retry mechanism ineffective.

- Add _request_json() and _request_json_with_headers() retry helpers
- Refactor get_returns() pagination to use retry helpers
- Add unit tests for retry behavior
- Fix type annotations for mypy compliance

Retry config: 3 attempts, exponential backoff (2s-10s), WARNING level logging
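
The key design point is that the request and the JSON parse are retried as one unit. The commit uses tenacity; the sketch below swaps in a plain standard-library loop to show the same shape. `request_json_with_retry` is a hypothetical helper, and the built-in ConnectionError and TimeoutError stand in for the requests exceptions named above.

```python
import time


def request_json_with_retry(fetch, attempts=3, min_wait=2.0, max_wait=10.0,
                            retry_on=(ConnectionError, TimeoutError),
                            sleep=time.sleep):
    """Call fetch() (request + JSON parse together), retrying transient errors."""
    for attempt in range(1, attempts + 1):
        try:
            # fetch must perform BOTH the HTTP call and the body parse, so an
            # error raised while reading the body is retried as well.
            return fetch()
        except retry_on:
            if attempt == attempts:
                raise
            # Exponential backoff bounded to the 2s-10s window.
            sleep(min(max_wait, min_wait * 2 ** (attempt - 1)))


calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection broken while reading body")
    return {"returns": []}


waits = []
result = request_json_with_retry(flaky_fetch, sleep=waits.append)
print(result, waits)
```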
0c7a0d323 Tim Richardson 2026-01-28T17:21:30+11:00 chore(cached_dear): Comment out InteriorIcons menu and update configs
Comment out the InteriorIcons dropdown menu section in the navigation
template as this functionality is no longer actively used. Also add
DEAR_INSTANCE_SET environment variable to the 1300TempFence PyCharm run
configuration for consistency with other entity configs.

Additionally, expand the 1300TempFence menu visibility condition to
include settings.TESTING, allowing the menu to appear in test
environments for development purposes.
0c52a31a6 Tim Richardson 2026-01-28T17:21:02+11:00 feat(cached_dear): Implement pricing rules system with Strategy pattern
Add best_deal selection method that evaluates ALL matching deals and picks
the highest discount, replacing legacy first-match behavior as the default.

New components:
- pricing_rules.py: Strategy pattern with DealSelectionMethod enum,
  LegacyFirstMatchStrategy, and BestDealStrategy classes
- test_pricing_rules.py: 20 unit tests for strategy implementations

Key changes:
- Pricing API now supports priority_method parameter (legacy/best_deal)
- Verbose mode shows all evaluated deals with was_selected flag
- Fixed delete_cache() to properly clear all customer entries using
  prefix matching instead of exact key lookup
- Updated help text for region prefix (RH=Ready Holdings, RI=Ready Industries)

Backwards compatible: "standard" maps to legacy strategy
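
A compact sketch of the Strategy pattern described above. The enum and class names are taken from this message; the `select` method, the `strategy_for` factory, and the `discount` dict key are hypothetical and only illustrate how the two behaviours differ.

```python
from enum import Enum


class DealSelectionMethod(Enum):
    LEGACY = "legacy"
    BEST_DEAL = "best_deal"


class LegacyFirstMatchStrategy:
    def select(self, deals):
        # First matching deal wins, regardless of discount size.
        return deals[0] if deals else None


class BestDealStrategy:
    def select(self, deals):
        # Evaluate every matching deal and keep the highest discount.
        return max(deals, key=lambda deal: deal["discount"], default=None)


_STRATEGIES = {
    DealSelectionMethod.LEGACY: LegacyFirstMatchStrategy,
    DealSelectionMethod.BEST_DEAL: BestDealStrategy,
}


def strategy_for(priority_method="best_deal"):
    if priority_method == "standard":
        priority_method = "legacy"  # backwards-compatible alias
    return _STRATEGIES[DealSelectionMethod(priority_method)]()


deals = [{"name": "A", "discount": 5}, {"name": "B", "discount": 15}]
print(strategy_for("standard").select(deals)["name"])
print(strategy_for().select(deals)["name"])
```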
4728f164c Tim Richardson 2026-01-28T14:14:04+11:00 feat(cached_dear,three_pl_dermapen): Add verbose mode to pricing API and enhance carrier diagnostics
Pricing API changes:
- Add verbose parameter to SiteHQ pricing endpoint for debugging
- Return diagnostic data including cache status, customer resolution
  source, price tier selection, and discount evaluation details
- Add TypedDict definitions for verbose response structure
- Add interactive test form to pricing API documentation page with
  live request/response display

Carrier diagnostics enhancements:
- Add 3PL carrier information to diagnostic results (delivery_company,
  tracking_added, shipped_date_3pl)
- Show effective carrier and its source to clarify what carrier will
  actually be used when shipping
- Update results template with new columns for 3PL carrier data

Code quality:
- Apply ruff formatting throughout pricing_utils.py and related files
- Fix type annotations and add explicit type hints
c884bd19b Tim Richardson 2026-01-28T13:07:52+11:00 feat(three_pl_dermapen): Add auto-redirect on carrier validation completion
Add automatic redirect to results page when carrier validation task
completes successfully. Uses a sentinel job_score value (9999) in the
final log message to signal the frontend WebSocket handler to redirect.

The redirect URL is passed to the template context so the JavaScript
can navigate users to the carrier_diagnostic_results page without
manual intervention.
15ebd8a19 Tim Richardson 2026-01-28T12:35:09+11:00 feat(three_pl_dermapen): Add carrier validation diagnostic tool
Implement a diagnostic system to detect orders with invalid or missing
carrier values in the ThreePL fulfilment workflow for Dermapen entities.

Key components:
- Celery task that validates unshipped orders against Cin7 Core's valid
  carrier list (case-insensitive matching)
- Django views with form to trigger diagnostics and display results
- Results cached for 1 hour, grouped by invalid carrier value
- Template UI with accordion grouping, carrier search/filter, and direct
  links to orders in Cin7 Core
- Integration tests for both pytest and manage.py shell execution

Also adds documentation for the thesaltyfox duplicate order processing
system explaining the detection and resolution workflow.
a7142ad47 Tim Richardson 2026-01-28T08:23:51+11:00 fix(three_pl): Clarify log message for header location sync
Updated the Pass 2 log message to explicitly state that the header
location change is being applied to the ThreePL_order_fulfilments row
(not the Dear order) and that the update is following a change that
originated in Dear/Cin7 Core.
e3205beae Tim Richardson 2026-01-28T08:07:22+11:00 fix(starshipit): Add email notifications for insufficient stock warnings
Ensures operational team is notified via email when orders fail due to
insufficient stock, even though these are now logged as WARNING instead
of ERROR (to reduce noise in error monitoring).
aed7afec0 Tim Richardson 2026-01-28T00:06:55+11:00 fix(starshipit): Log insufficient stock errors as WARNING instead of ERROR
Dear API 409 "Insufficient Stock" errors are transient operational conditions
caused by race conditions when stock is claimed between solution-finding and
pick creation. These are not bugs requiring developer attention.

Changes:
- Catch InsufficientStock_error explicitly and log as WARNING
- Detect 409 errors in API_error messages (when JSON parsing fails) and log as WARNING
- Other API errors and exceptions remain logged as ERROR

This reduces noise in error logs, making it easier to identify actual bugs.
48069034b Tim Richardson 2026-01-27T22:43:03+11:00 fix(revel_pos): Aggregate voided payments before sending to Dear
Add aggregate_payments_by_type() function to consolidate payments by
(payment_type, card_type) before applying them to Dear invoices. This
handles Revel POS void/reversal scenarios where a voided card payment
appears as +amount, -amount, +amount in the transaction history.

Dear API rejects negative payment amounts, so aggregating nets the
values correctly (e.g., +430, -430, +430 becomes +430). Payments with
net zero or negative amounts are skipped with diagnostic logging.
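
The netting step can be sketched as below. This is an illustration of the idea rather than the committed function: the dict keys `payment_type`, `card_type` and `amount`, and rounding to two decimal places, are assumptions.

```python
from collections import defaultdict


def aggregate_payments_by_type(payments):
    """Net payments per (payment_type, card_type); drop non-positive totals."""
    totals = defaultdict(float)
    for payment in payments:
        key = (payment["payment_type"], payment["card_type"])
        totals[key] += payment["amount"]
    # Round to cents so float noise does not leave a tiny residual amount.
    return {key: round(total, 2) for key, total in totals.items()
            if round(total, 2) > 0}


history = [
    {"payment_type": "card", "card_type": "visa", "amount": 430.0},
    {"payment_type": "card", "card_type": "visa", "amount": -430.0},  # void
    {"payment_type": "card", "card_type": "visa", "amount": 430.0},
    {"payment_type": "cash", "card_type": None, "amount": 20.0},
    {"payment_type": "cash", "card_type": None, "amount": -20.0},     # nets to 0
]
print(aggregate_payments_by_type(history))
```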

Includes comprehensive unit tests covering void scenarios, floating-
point precision, multiple payment types, and logger callback behavior.
a6337e88e Tim Richardson 2026-01-27T22:42:45+11:00 fix(dear_purchasing): Log auto_receive success with OK status instead of ERROR
The success case in process_orders_auto_receive was incorrectly logging with
ERROR status despite the message saying "succeeded". This was a copy-paste bug
where the status wasn't updated when the success case was added.

Fixes BUG-83d1d880: Success message logged with 'error' status in
Process Orders Auto Receive task.
ad4da30f6 Tim Richardson 2026-01-27T22:35:43+11:00 fix(zoho_crm): Escape commas and backslashes in Zoho Search API criteria
The Zoho CRM Search API requires commas, parentheses, and backslashes to be
escaped with a backslash when used in search criteria values. The existing
escape_zoho_characters_v2() function only escaped parentheses, causing
INVALID_QUERY errors when account names contained commas (e.g., "Smith, John").

Changes:
- Updated escape_zoho_characters_v2() to also escape commas and backslashes
- Added filtering for empty/whitespace names in get_many_zoho_accounts_by_name()
- Added get_zoho_crm_enhanced() None guard for missing ZohoCRMSettings
- Fixed all mypy type annotation issues (26 errors resolved)
- Fixed all ruff line-too-long issues in modified files
- Added unit tests for escape function edge cases

Fixes BUG-cb7e48e6: Zoho CRM API returns 400 'Invalid query formed' when
searching for accounts with special characters in names.
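
A minimal sketch of the escaping rule, assuming a single regex pass is acceptable; `escape_zoho_criteria_value` is a hypothetical name and not the body of escape_zoho_characters_v2(). Doing it in one pass matters: escaping backslashes in a separate, later step would double-escape the backslashes just added for commas and parentheses.

```python
import re

# Characters named in the commit as needing a backslash escape.
_ZOHO_SPECIALS = re.compile(r"([\\,()])")


def escape_zoho_criteria_value(value: str) -> str:
    """Prefix each backslash, comma and parenthesis with a backslash."""
    return _ZOHO_SPECIALS.sub(r"\\\1", value)


print(escape_zoho_criteria_value("Smith, John"))
print(escape_zoho_criteria_value("Acme (Aust) Pty"))
```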
c49071411 Tim Richardson 2026-01-27T17:08:11+11:00 chore(3pl): Add test harness for resending stuck fulfilment orders
Utility script to debug and resend orders stuck in BAD_ADDRESS or other
error states. Features:
- Displays current fulfilment row status and both address locations
- Forces cache refresh from Dear API
- Resets status to QUEUED and clears order_json
- Optionally triggers immediate send to 3PL

Usage: kubectl -n <namespace> exec -i deploy/worker-process-pool -- python - --order-number <ORDER> [--execute] < three_pl_dermapen/test_harness_resend_queued_order.py
938f6b806 Tim Richardson 2026-01-27T17:08:03+11:00 fix(3pl): Always fetch fresh Dear data when sending orders to 3PL
The "Regenerate & Resend" button was not working because send_fulfilment_row_to_3pl
methods used get_cached_sale_row_by_order_nbr_or_id() which reads from stale cache
instead of fetching fresh data from Dear API.

Changed all 3PL managers (Hoxton, Prism, Skyzer, Hexspoor) to use
get_sale_raw_and_update_cache() which always fetches fresh data. This ensures
address corrections made in Dear are picked up when regenerating orders.

Affected files:
- manager_hoxton.py: HoxtonManager.send_fulfilment_row_to_3pl
- manager_prism_api.py: PrismAPIManager.send_fulfilment_row_to_3pl
- manager_skyzer.py: SkyzerManager.send_fulfilment_row_to_3pl
- three_pl_logic_dermapen.py: Hexspoor send flow
184918578 Tim Richardson 2026-01-27T16:37:55+11:00 chore(overlays): Retire interioricons namespace
Move interioricons overlay to retired_overlays directory.
The namespace will no longer be included in deployments.
c6149c4ad Tim Richardson 2026-01-27T16:33:04+11:00 fix(3pl): Use properly typed InitErrorDetails for Pydantic v2 ValidationError
Fixes ValueError "'error' required in context" when sending orders to
Hoxton/CartonCloud. Pydantic v2's from_exception_data() requires:
- ctx={'error': ValueError(...)} when type='value_error'
- Properly typed InitErrorDetails instead of raw dict literals

Changes:
- manager_hoxton.py: Fix 3 ValidationError calls for country validation
- manager_prism_api.py: Fix 3 ValidationError calls for address validation

The properly typed approach ensures mypy catches similar issues at
compile time rather than failing at runtime in production.
c443b1963 Tim Richardson 2026-01-27T14:56:47+11:00 fix(hexspoor): Add pre-flight phone validation with clear error message
Previously, orders with missing/empty phone numbers would fail with a
cryptic Pydantic error "phone: String should have at least 1 character"
which was confusingly logged as "has not been accepted by the 3PL".

Now validates phone before Pydantic model creation and raises
AddressValidationError with an actionable message telling users to add
a phone number in Dear/Cin7, reset to AUTHPICKPACK, and resend.

Adds parametrized unit tests for phone validation edge cases.
494d197b7 Tim Richardson 2026-01-27T12:22:37+11:00 feat(worktree): Auto-append .envrc.worktree to .gitignore on creation
When creating a new worktree, the manager now checks if .envrc.worktree
is in .gitignore and appends it if missing. This ensures worktree-specific
port configurations are never accidentally committed, even when creating
worktrees from branches that predate the .gitignore update.
6be9e8508 Tim Richardson 2026-01-27T11:33:20+11:00 fix(security): Encrypt common-secret-env.values with git-crypt
The file was previously committed before .gitattributes encryption rules
were in effect. This commit stages the properly encrypted version.

Note: Unencrypted versions may still exist in repository history.
10bae96fa Tim Richardson 2026-01-25T17:00:27+11:00 fix(starshipit): Use database skip list in DJCity connector
Updates send_packed_order_to_starshipit to read the skip list from
StarshipitSettings (like the generic Manager does) instead of only
using hardcoded defaults. This ensures that:

1. New methods added via /starshipit/analyze-shipping-methods/ are respected
2. Admins can manually add/remove methods in Django Admin
3. DJCity behaves consistently with other Starshipit integrations

The skip list is cached per-instance and auto-populated with defaults
if empty, matching the behavior of DearStarshipitManager.
61c7bea1f Tim Richardson 2026-01-25T16:50:41+11:00 fix(starshipit): Filter out Local Pickup orders in DJCity connector
DJCity orders with "Local Pickup" carrier were being sent to Starshipit
where they always fail. The generic DearStarshipitManager had this filter,
but DearStarshipitDJCity (a parallel implementation) was missing it.

Changes:
- Add ShippingMethodExcluded exception for graceful skip handling
- Add query-level filter in non_click_and_collect_query_to_send()
- Add defense-in-depth check in send_packed_order_to_starshipit()
- Update exception handlers to silently skip excluded carriers

Fixes: Orders like SO-827273/1 with carrier "Local pickup" now skip
gracefully instead of failing with Starshipit API errors.
2058563ad Tim Richardson 2026-01-25T16:05:42+11:00 fix(settings): Disable 2FA for local development
Two-factor auth is required in production but adds friction to local dev
environments. This change:

- Sets DISABLE_OTP_MIDDLEWARE flag in local_settings.py
- settings.py removes OTP middleware when flag is set
- Bypasses 2FA enforcement while keeping apps installed for URL routing

The previous approach of modifying MIDDLEWARE directly in local_settings.py
failed with NameError because MIDDLEWARE isn't defined in that namespace.
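
The flag-based approach can be sketched as a small filter applied in settings.py after MIDDLEWARE is defined. The `apply_otp_flag` helper is hypothetical, and the assumption that the OTP middleware in question is `django_otp.middleware.OTPMiddleware` is not confirmed by the commit.

```python
# Assumed dotted path of the middleware being removed.
OTP_MIDDLEWARE = "django_otp.middleware.OTPMiddleware"


def apply_otp_flag(middleware, disable_otp):
    """Return the middleware list, without the OTP entry when the flag is set."""
    if not disable_otp:
        return list(middleware)
    return [entry for entry in middleware if entry != OTP_MIDDLEWARE]


MIDDLEWARE = [
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    OTP_MIDDLEWARE,
]

# local_settings.py only sets a boolean; settings.py owns the list itself,
# which avoids referencing MIDDLEWARE from a namespace where it is undefined.
DISABLE_OTP_MIDDLEWARE = True
MIDDLEWARE = apply_otp_flag(MIDDLEWARE, DISABLE_OTP_MIDDLEWARE)
print(MIDDLEWARE)
```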
afefaba01 Tim Richardson 2026-01-25T16:01:31+11:00 feat(dev): Disable 2FA for local development
Removes OTP middleware and changes LOGIN_REDIRECT_URL to /admin/
to allow direct login without 2FA in dev environments.
3fe73a8bb Tim Richardson 2026-01-25T15:39:15+11:00 feat(worktree): Add 2FA device setup for dev superuser
The app requires 2FA for admin login. Create a StaticDevice with
a known backup code (devcode000) so developers can log in without
scanning QR codes.

Login credentials: admin / devpassword
2FA backup code: devcode000
85560afcd Tim Richardson 2026-01-25T15:16:17+11:00 fix(worktree): Fix superuser existence check
'NOT_EXISTS' contains 'EXISTS' substring, causing false positives.
Changed output tokens to USER_FOUND/USER_MISSING to avoid this.
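
The false positive is easy to reproduce. In this sketch `user_exists` is a hypothetical helper that parses the script output using the new tokens.

```python
# Old tokens: a substring test cannot tell the two outcomes apart.
old_missing_output = "NOT_EXISTS"
print("EXISTS" in old_missing_output)  # True, although the user is missing


def user_exists(output: str) -> bool:
    """Parse the new tokens, neither of which contains the other."""
    return "USER_FOUND" in output


print(user_exists("USER_FOUND"), user_exists("USER_MISSING"))
```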
d1219fe33 Tim Richardson 2026-01-25T15:09:48+11:00 fix(worktree): Use manage.py shell for superuser creation
Django apps must be loaded before accessing User model.
Changed 'python -c' to 'python manage.py shell -c' for all
Django ORM operations in create_superuser().
c780222ab Tim Richardson 2026-01-25T14:55:34+11:00 fix(worktree): Extract compose isolation to committed file
- Create .envrc.compose with Docker Compose project isolation logic
- Fix bug: .envrc was sourcing non-existent .envrc.worktree.py
  instead of .envrc.worktree, causing worktrees to use default
  ports and conflict with main repo
- Update worktree_manager to symlink .envrc.compose into worktrees
12411c1ee Tim Richardson 2026-01-25T14:39:11+11:00 docs(worktree): Add YAML frontmatter to worktree skill
Add name and description fields to enable proper skill discovery
and integration with Claude Code's skill system.
b33adb3a7 Tim Richardson 2026-01-25T14:12:17+11:00 docs(worktree): Update documentation for Python worktree manager
- Rewrite worktree skill to document Python script (amazon_eks/helpers/worktree.py)
- Document all commands: create, cleanup, list, ports
- Add setup_dev_environment.sh documentation for manual fixes
- Update Django 5.2 migration plan to use Python worktree manager
- Mark make_worktree.sh Python port plan as completed
- Fix comment in setup_dev_environment.sh header

The Python worktree manager replaces the old make_worktree.sh bash script,
providing automatic port allocation, git-crypt handling, Django setup,
and Docker Compose isolation for parallel development.
6b8ab24c6 Tim Richardson 2026-01-25T13:47:12+11:00 feat(youtrack): Add ticket creation capability to YouTrack skill
Add comprehensive documentation for creating new YouTrack tickets via the API:
- New usage example: /youtrack create "Title" --type Feature --subsystem tilecloud
- API request format with proper $type annotations for custom fields
- Custom field types reference table (Type, Subsystem, Priority, State, Assignee)
- Python function create_youtrack_ticket() with retry logic and credential caching
- Presentation template for created ticket confirmation
- Updated Interactive Flow to handle create action with flag parsing
b1e17608a Tim Richardson 2026-01-24T20:19:32+11:00 fix(prism_api): Change Unit of Measure from ST to EA
The Prism API expects 'EA' (Each) as the unit code, not 'ST' (Stück/piece).
77349fec1 Tim Richardson 2026-01-24T20:07:35+11:00 feat(three_pl_dermapen): Add Regenerate & Resend option and fix legacy linting issues
New Features:
- Added "Regenerate & Resend" button to 3PL row status page that clears payload
  fields (order_json, order_xml, order_xml_file_name) while preserving 3PL IDs,
  allowing orders to be resent with updated business logic without creating
  duplicates at the 3PL
- Improved user guidance with clearer descriptions for all three reset options:
  Re-poll Status, Regenerate & Resend, and Full Reset
- Enhanced warning message when AuthPickPack reset is blocked to explain why
  and suggest alternatives

Bug Fixes:
- Fixed dead code: removed redundant nested `if settings.TESTING` check
- Fixed incorrect task name `process_fulfilments_in_3pl_send_queue` to
  `task_send_queued_orders` in non-testing branch
- Fixed `raise HttpResponseBadRequest` to `raise Http404` (responses can't be raised)

Code Quality (ruff/mypy fixes):
- Added type annotations: `fields: list[str]`, `selected_ids: set[int]`,
  `app_name: str | None`
- Fixed implicit Optional types (PEP 484 compliance)
- Split long lines to meet 120 char limit
- Extracted SELECTABLE_STATUSES constant for cleaner code
- Added type ignore comment for TypedDict mutation in template context
f53a900fd Tim Richardson 2026-01-24T18:30:53+11:00 feat: Auto-commit by deployer
ca73aec07 Tim Richardson 2026-01-24T18:15:23+11:00 feat: Auto-commit by deployer
373d721f7 Tim Richardson 2026-01-24T17:53:38+11:00 feat: Auto-commit by deployer
f20dcd672 Tim Richardson 2026-01-24T17:53:21+11:00 feat(three_pl_dermapen): Add Quantity_Unit, Unit, and Order_Line_Number to Prism order items
Prism API order items now include:
- Quantity_Unit: Set equal to Quantity_Sku (1 unit = 1 SKU item)
- Unit: "ST" (Stück/piece) standard unit code
- Order_Line_Number: Sequential line numbers starting from 1

These fields align with Prism's expected payload format for outbound orders.
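
As an illustration, the three fields could be populated like this. The sketch assumes input lines carry hypothetical `sku` and `qty` keys and an output `Sku` key; only Quantity_Sku, Quantity_Unit, Unit and Order_Line_Number are named in this log. The `unit` parameter is left configurable because a later commit in this log changes the code from ST to EA.

```python
def build_prism_items(lines, unit="ST"):
    """Build order items with Quantity_Unit, Unit and Order_Line_Number."""
    items = []
    for line_number, line in enumerate(lines, start=1):
        items.append({
            "Sku": line["sku"],              # hypothetical key name
            "Quantity_Sku": line["qty"],
            "Quantity_Unit": line["qty"],    # 1 unit = 1 SKU item
            "Unit": unit,
            "Order_Line_Number": line_number,  # sequential, starting from 1
        })
    return items


items = build_prism_items([{"sku": "DP-001", "qty": 2}, {"sku": "DP-002", "qty": 5}])
print(items)
```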
6e1569eb6 Tim Richardson 2026-01-24T09:07:12+11:00 chore(lint): Add ruff.toml and fix E501 line-length violations in starshipit_djcity.py
- Create ruff.toml with 120 char line-length limit and Python 3.13 target
- Enable E, F, W rule sets for pycodestyle and pyflakes
- Fix all E501 violations in starshipit_djcity.py by wrapping long f-strings
  using parenthesized string concatenation pattern
- Reformat long docstrings and comments to comply with line length
b5ee6bfaa Tim Richardson 2026-01-24T08:43:46+11:00 fix(starshipit): Handle empty ShippingAddress.Country in DJCity shipped sales
DAS-380: The task_djcity_process_shipped_sales task was failing when
orders had empty ShippingAddress.Country fields, causing InvalidAddress
exceptions to bubble up and fail the entire task.

Changes:
- Add default_country="Australia" to dear_ship_address_helper() call,
  matching the existing pattern at lines 1705-1706 for DJCity orders
- Add try/except for InvalidAddress to gracefully handle orders with
  invalid addresses (empty Line1, City, or Country after defaulting)
- Log warning, email for manual review, and continue processing
  remaining orders instead of failing the whole task

This prevents one bad order from blocking shipment processing for all
subsequent orders in the batch.
0c2c2a5d9 Tim Richardson 2026-01-24T08:34:02+11:00 fix(three_pl_skyzer): Fix SimpleSuccessResponse instantiation with invalid kwarg
SimpleSuccessResponse only has a 'success' field, not 'message'.
When handling 204 No Content responses, use success=True instead.
911cd28f8 Tim Richardson 2026-01-24T08:33:18+11:00 fix(three_pl_skyzer): Add missing short_message parameter to ThreePLAPIError calls
All ThreePLAPIError and ThreePLDuplicateOrderAPIError instantiations in
skyzer_connector_api.py were missing the required 'short_message' parameter,
causing TypeError exceptions at runtime that masked the original errors.

Fixes:
- Added short_message to 6 ThreePLAPIError calls
- Added short_message and order_id=0 to fallback ThreePLDuplicateOrderAPIError
- Fixed order_id type conversion from header string to int

Resolves: DAS-385
a58c36140 Tim Richardson 2026-01-24T08:25:34+11:00 fix(cin7_sync): Raise exception instead of returning None on lock timeout
When get_promise_date_dict() failed to acquire the Redis lock, it logged
the error but returned None implicitly. Callers expecting a tuple would
then fail with "TypeError: cannot unpack non-iterable NoneType object".

This fix re-raises LockNotAcquiredAfterTimeout so the exception propagates
to the task's existing error handler, which properly logs and marks the
Celery task as failed.
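
A sketch of the failure mode and the fix, using stand-in definitions. The exception class, acquire_lock() and the simplified get_promise_date_dict() signature are assumptions made for illustration.

```python
class LockNotAcquiredAfterTimeout(Exception):
    """Stand-in for the project's lock timeout exception."""


def acquire_lock(available: bool) -> None:
    """Hypothetical lock helper that raises when the lock cannot be acquired."""
    if not available:
        raise LockNotAcquiredAfterTimeout("promise-date lock busy")


def get_promise_date_dict(lock_available: bool) -> tuple[dict, int]:
    try:
        acquire_lock(lock_available)
    except LockNotAcquiredAfterTimeout:
        # Log here if needed, then re-raise. Swallowing the exception would
        # let the function fall through and return None implicitly, which
        # breaks callers that unpack the result as a tuple.
        raise
    return {"SO-1": "2026-02-01"}, 1
```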

Fixes: DAS-382, DAS-384
f6f8a1735 Tim Richardson 2026-01-24T08:17:58+11:00 feat(analyze_job_errors): Show all classifications in verbose mode
With the --verbose flag, the script now displays Claude's classification
for every error, not just bugs:

  Classifications: 0 bugs, 1 operational
    [OPERATIONAL] Task: shopify.tasks.task_shopify_stock_sync_v3
      -> Network connectivity issue - the remote Shopify API...

This helps understand why certain errors aren't flagged as bugs.
c16ac5bf0 Tim Richardson 2026-01-24T08:14:38+11:00 feat(analyze_job_errors): Display token usage and costs
Add token usage tracking for Claude CLI analysis:

- New ClaudeUsage dataclass to hold token counts and cost
- Switch to --output-format json to get usage stats from CLI
- Display per-namespace token usage during analysis
- Show total usage summary at the end of run

Example output:
  Tokens: 3 in, 1657 out, 27906 cached | Cost: $0.0908
198f2b573 Tim Richardson 2026-01-24T08:08:21+11:00 feat(analyze_job_errors): Increase error message context to 2000 chars
Doubles the message truncation limit from 1000 to 2000 characters,
providing more context for Claude to analyze complex error messages
that may include partial stack traces or detailed error descriptions.
b62f5da26 Tim Richardson 2026-01-24T08:00:36+11:00 fix(analyze_job_errors): Use Sonnet model and fix OAuth/error handling
Several improvements to the Claude CLI integration:

1. Use Sonnet model instead of Opus for error analysis
   - Faster and more cost-effective for classification tasks
   - Add --model sonnet flag to CLI invocation

2. Remove ANTHROPIC_API_KEY from subprocess environment
   - Allows OAuth credentials from `claude login` to be used
   - Avoids conflicts with stale/invalid API keys in environment

3. Fix false-positive error detection
   - Recognize markdown-wrapped JSON (```json) as valid response
   - Only check first line for CLI error patterns
   - Prevents flagging valid responses containing "error" in content
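
One way the response check in point 3 could work, as a sketch only. The ERROR_PREFIXES values and the parse_cli_response() name are assumptions; the script's real patterns are not shown in this log.

```python
import json

FENCE = "`" * 3  # markdown code fence marker

# Hypothetical error markers, checked against the first line only.
ERROR_PREFIXES = ("error:", "invalid api key")


def parse_cli_response(stdout: str) -> dict:
    text = stdout.strip()
    first_line = text.splitlines()[0].lower() if text else ""
    if first_line.startswith(ERROR_PREFIXES):
        raise RuntimeError(f"CLI error: {first_line}")
    if text.startswith(FENCE):
        # Drop the opening fence line (with its "json" tag) and the closing fence.
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        text = "\n".join(lines)
    return json.loads(text)
```

Because only the first line is inspected, a valid JSON payload whose content mentions "error" is still parsed normally.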
77e15ffc5 Tim Richardson 2026-01-24T07:49:04+11:00 fix(analyze_job_errors): Improve Claude CLI error message handling
The Claude CLI sometimes writes error messages to stdout instead of
stderr (e.g., "Invalid API key"). This caused the script to show
"Claude CLI error: " with an empty message.

Changes:
- Check both stderr and stdout when exit code is non-zero
- Include exit code in error message for debugging
- Detect error messages in stdout even with exit code 0 (e.g., when
  the response doesn't look like JSON and contains error keywords)
efd9cfa90 Tim Richardson 2026-01-24T07:42:19+11:00 feat(analyze_job_errors): Add HTTP 500 web error detection
Extend the job error analysis script to capture user-facing HTTP 500
errors from web container logs. This addresses template rendering
errors and other UI failures that weren't being detected.

Changes:
- Add WebErrorEntry dataclass for HTTP 500 error data
- Add extract_web_errors_from_logs() function that:
  - Detects HTTP 500 responses from Uvicorn/h11 logs
  - Associates nearby Python tracebacks (±2 second window)
  - Skips static files and health checks
- Update NamespaceErrors to include web_errors field
- Add "Web Errors (HTTP 500)" section to generated report
- Update summary and namespace details with web error counts
- Generate deterministic BUG-IDs for web errors using URL path

Web errors are grouped by URL path in the report, showing affected
namespaces and example tracebacks when available.
726b247d2 Tim Richardson 2026-01-24T07:35:02+11:00 docs: Add design for web error detection in analyze_job_errors.py
Extend the job error analysis script to capture HTTP 500 errors from
web container logs, associate them with nearby tracebacks, and report
them separately from background task errors.

Key additions planned:
- WebErrorEntry dataclass for HTTP 500 data
- Detection patterns for Uvicorn 500 responses
- Traceback association within ±2 second window
- Separate "Web Errors (HTTP 500)" report section
- Integration with Claude analysis and YouTrack ticketing
f400662aa Tim Richardson 2026-01-24T07:13:50+11:00 feat: Auto-commit by deployer
5fb49ecba Tim Richardson 2026-01-24T07:05:22+11:00 feat: Auto-commit by deployer
c894abd95 Tim Richardson 2026-01-24T06:59:22+11:00 feat: Auto-commit by deployer
fab821967 Tim Richardson 2026-01-24T06:49:24+11:00 feat: Auto-commit by deployer
5cc5ebf43 Tim Richardson 2026-01-24T06:40:58+11:00 feat: Auto-commit by deployer
b529e579a Tim Richardson 2026-01-23T16:39:59+11:00 feat: Auto-commit by deployer
eb5d87f88 Tim Richardson 2026-01-23T13:40:46+11:00 fix(cin7_sync): Add SHORT_HOSTNAME prefix to lock keys and pass lock_metadata
Lock keys now include settings.SHORT_HOSTNAME prefix so they appear
in the shared lock diagnostics view (list_redis_locks).

Changes:
- cin7_cached_api.py: Prefix global_cache_lock_name with SHORT_HOSTNAME,
  add lock_metadata to refresh_cache_object and update_stock
- cin7_cache_coordinator.py: Prefix singleflight lock keys with SHORT_HOSTNAME
- tasks.py: Add lock_metadata to rebuild_cin7_stock_cache and stock_data task
- stock_in_warehouse_report.py: Add lock_metadata, fix bug where job_id
  parameter was always overwritten to None

This aligns cin7_sync locking patterns with cached_dear patterns,
enabling the diagnostics view to show all active locks with their
associated job_id and hostname metadata.
422d3349d Tim Richardson 2026-01-23T11:47:14+11:00 docs: Add design documents for lock management and pytz migration
- shared-lock-management-design.md: Template consolidation approach
- pytz-to-zoneinfo-migration.md: Migration plan for cin7_sync module
16f4ed647 Tim Richardson 2026-01-23T11:46:37+11:00 fix: Add type stubs and fix invalid type comments
Add missing type stub packages:
- types-python-dateutil
- types-cachetools

Fix invalid type comment syntax that was blocking mypy:
- shopify_connector.py: Add Generator return type annotation
- starshipit_djcity.py: Convert type comments to inline annotations,
  fix Dict[K:V] syntax to Dict[K, V], fix Q import path
- order_autopicking_djcity.py: Remove redundant type comment

These fixes resolve mypy syntax errors while maintaining type safety.
a75f4a1c7 Tim Richardson 2026-01-23T11:46:30+11:00 refactor(cin7_sync): Migrate pytz to stdlib zoneinfo
Replace pytz with stdlib alternatives (zoneinfo requires Python 3.9+):
- pytz.utc → datetime.UTC (Python 3.11+)
- pytz.timezone() → zoneinfo.ZoneInfo()

This removes the pytz dependency from the cin7_sync module while
maintaining identical timezone handling behavior.
8df087a5a Tim Richardson 2026-01-23T11:46:24+11:00 refactor: Move lock management templates to core for shared use
Move HTMX-powered lock management templates from cached_dear to
core/templates/core/diagnostics/ so they can be shared between
cached_dear and cin7_sync apps.

Templates now use context variables for URL generation:
- list_locks_partial_url, release_single_lock_url, reset_locks_url
- list_locks_url, index_url (for navigation)

This enables both apps to use identical templates while maintaining
their own URL namespaces.
4783a64c9 Tim Richardson 2026-01-23T11:10:09+11:00 feat(cin7_sync): Add API request tracking and acks_late=False to tasks
Add lightweight cache-based API request tracking to identify which Celery
tasks are driving Cin7 API demand:

- Track requests per task name and hour bucket in Valkey cache
- Separate tracking for 429 rate limit hits
- New view at /cin7_sync/api_request_stats/ shows breakdown
- Stats persist for 24 hours with automatic expiry

Also add acks_late=False to all Cin7 Celery tasks to prevent re-queuing
on failure, matching the pattern in cached_dear/tasks_cache.py.

Tasks updated: rebuild_cin7_stock_cache, rebuild_cin7_purchase_order_cache,
rebuild_cin7_contacts_cache, rebuild_cin7_payments_cache, api_order_promise_date,
task_update_neto_with_promise_dates, task_update_shopify_with_promise_dates,
task_satara_neto_custom_fields, task_update_po_with_linked_orders,
task_generate_cin7_promise_dates_report, task_compare_bom_vs_legacy_promise_dates
f5df715de Tim Richardson 2026-01-23T09:40:11+11:00 feat(cin7_sync): Add ±20% jitter to rate limit backoff timing
Prevents thundering herd when multiple worker processes hit rate limits
simultaneously - they now wake up at slightly different times instead
of all retrying at exactly the same moment.
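
A minimal sketch of ±20% jitter. The with_jitter() name and signature are hypothetical.

```python
import random


def with_jitter(delay_seconds: float, fraction: float = 0.2) -> float:
    """Scale the delay by a random factor in [1 - fraction, 1 + fraction]."""
    return delay_seconds * random.uniform(1 - fraction, 1 + fraction)


# A nominal 10 second backoff becomes a wait of roughly 8 to 12 seconds.
print(round(with_jitter(10.0), 2))
```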
b5a86adff Tim Richardson 2026-01-23T08:50:22+11:00 fix(cin7_sync): Make get_next_auth() wait for keys instead of raising immediately
When all API keys are in cooldown, get_next_auth() now waits for the
soonest key to become available (for up to 5 minutes) instead of
raising Cin7RateLimitError immediately.

This fixes a gap where tasks would fail immediately if starting when
all keys happened to be in cooldown, rather than waiting for recovery.

The change simplifies _make_request_with_rate_limit_handling() since
the wait logic is now centralized in get_next_auth().
bf3a9905a Tim Richardson 2026-01-23T08:40:15+11:00 feat: Auto-commit by deployer
b8e4212a2 Tim Richardson 2026-01-23T08:40:06+11:00 feat(cin7_sync): Remove circuit breaker and tune rate limit backoff timing
Replace the redundant circuit breaker with simpler exponential backoff:
- Remove all circuit breaker code (constants, methods, checks)
- Reduce base cooldown from 60s to 5s (aligned with Cin7's 60/min limit)
- Cap max cooldown at 60s (the per-minute quota reset window)
- Increase retries from 5 to 7 for better recovery

New backoff sequence: 5s → 10s → 20s → 40s → 60s → 60s → 60s
Total max wait ~255s (~4 min) before giving up.

Also add test harness for observing rate limit behavior in production.
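
The backoff sequence above follows from doubling a 5s base and capping at 60s. A sketch of that arithmetic, with a hypothetical function name:

```python
def cooldown_sequence(base: int = 5, cap: int = 60, retries: int = 7) -> list[int]:
    """Double the base delay on each attempt, never exceeding the cap."""
    return [min(base * 2**attempt, cap) for attempt in range(retries)]


delays = cooldown_sequence()
print(delays)       # [5, 10, 20, 40, 60, 60, 60]
print(sum(delays))  # 255
```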
a44a19637 Tim Richardson 2026-01-23T08:08:37+11:00 feat(cin7_sync): Add exponential backoff for per-key rate limit cooldowns
When a Cin7 API key receives a 429 rate limit response, consecutive failures
now result in progressively longer cooldowns (60s → 120s → 240s → 300s max)
instead of a fixed 60-second cooldown. This prevents "thundering herd" when
all keys recover simultaneously and immediately trigger another rate limit
storm.

Changes:
- Cache now stores {cooldown_until, failure_count} dict instead of timestamp
- Added _reset_key_failure_count() to clear backoff state on successful requests
- Backward compatible: handles both old (timestamp) and new (dict) cache formats
- Logs now show failure count and backoff multiplier for observability
ec4ff5b28 Tim Richardson 2026-01-22T21:19:42+11:00 fix(cin7_sync): Fix mypy type errors and remove legacy type comments
- Remove legacy `# type:datetime` inline comments that caused parse errors
- Fix merged lines that resulted from comment removal
- Update function parameters to use `int | None` instead of `int = None`
- Update datetime parameters to use `datetime | str | None` syntax
- Add explicit return statements where missing
- Remove unused `Union` import (now using | syntax)
094961b85 Tim Richardson 2026-01-22T21:08:56+11:00 chore(deps): Remove outdated types-urllib3 stub package
urllib3 2.x ships with inline type annotations (py.typed marker), making
the separate types-urllib3 package unnecessary. The stub package only
covered urllib3 1.x API and caused false mypy errors for valid 2.x
parameters like `backoff_max` and `respect_retry_after_header`.
6bfd01a2d Tim Richardson 2026-01-22T21:05:14+11:00 feat(helpers): Enhance healthcheck analyzer with full implementation
Implements the complete healthcheck failure investigator per the design:
- Fetch failed checks from healthchecks.io API (status='down')
- Extract namespace from tags with fallback inference from check name
- Gather K8s context: deployment status, JobLogs, pod log tracebacks
- Find relevant Celery task source code via grep
- Claude CLI triage with structured JSON output (summary, cause, hints)
- YouTrack ticket creation with HC-<uuid[:8]> deduplication
- Markdown report generation grouped by severity

CLI options:
  --create-tickets    Create YouTrack tickets for failures
  --check-uuid        Investigate single check by UUID
  --output            Write report to file
  --skip-investigation  Skip Claude, just gather context
  --verbose           Show detailed progress
287c95342 Tim Richardson 2026-01-22T21:03:53+11:00 chore: Add analyze_health_checks helper
37930e4e5 Tim Richardson 2026-01-22T21:03:40+11:00 feat: Auto-commit by deployer
0e7b89438 Tim Richardson 2026-01-22T21:03:35+11:00 fix(cin7_sync): Prevent silent task hangs from urllib3 Retry-After sleeps
Root cause: When Cin7 API returned 5xx errors with Retry-After headers,
urllib3's default behavior (respect_retry_after_header=True) caused tasks
to sleep silently for extended periods with no logging, appearing "dead".

Changes:
- Disable urllib3's respect_retry_after_header to prevent silent sleeps
- Add backoff_max=30 to cap exponential backoff at 30 seconds
- Reduce lock TTL from 7200s to 300s (matching cached_dear pattern)
  so locks expire in 5 minutes instead of 2 hours if task dies
- Add missing delete_all parameter to rebuild_cin7_contacts_cache task

Also applied via database: Staggered 12:05 task schedules (sales order
cache at 12:05, purchase order cache at 12:15) to prevent lock contention.
6097a8775 Tim Richardson 2026-01-22T21:00:21+11:00 chore(revel_pos): Remove auto-create suggestion from PO validation emails
The "Or enable auto-create in sync settings" text was removed from
NOT_IN_REVEL error messages to provide clearer, more direct fix instructions
to users when products are missing from Revel POS.
218cde5b0 Tim Richardson 2026-01-22T20:46:37+11:00 docs(plans): Add healthcheck failure investigator design
Design for a script that monitors healthchecks.io for failed Celery task
heartbeats, uses Claude CLI to triage failures, and creates YouTrack tickets.

Key features:
- Fetches failed checks from healthchecks.io API
- Extracts namespace from tags (with fallback inference from name)
- Gathers K8s context: JobLogs, pod logs, deployment status
- Finds relevant task source code
- Claude CLI produces triage summary (severity, likely cause, hints)
- Creates YouTrack tickets with HC-<uuid> deduplication
- Manual/on-demand execution
2783f96d5 Tim Richardson 2026-01-22T15:01:18+11:00 fix(cached_dear): Fix customer autocomplete in Delete Pricing Cache form
- Enhanced customer_autocomplete() to return customer GUID in response
  for forms that need to populate a customer_id field directly
- Fixed DeletePricingCacheForm data-server-params to reference the correct
  hidden field (id_dear_instance_name instead of non-existent id_customer_search_type)
- Added hidden dear_instance_name field to form for autocomplete endpoint
- Replaced broken event listener JS with proper Bootstrap5-Autocomplete
  initialization using onSelectItem callback
- Renamed menu item from "1300 TempFence" to "1300TempFence" (no space)
- Added Pricing API Documentation page and comprehensive test suite

The autocomplete now properly:
1. Sends dear_instance_name parameter via hidden field
2. Uses the library's onSelectItem callback (not custom events)
3. Populates the customer GUID into customer_id field on selection
4. Provides visual feedback on successful selection
88df17495 Tim Richardson 2026-01-22T11:33:26+11:00 feat(devenv): Gitignore .envrc and improve template-based setup
Security: .envrc now gitignored since it contains secrets (YOUTRACK_TOKEN,
AWS credentials). Developers copy from .envrc_template and add their own
credentials.

Template improvements:
- Full uv-based venv setup with Python 3.12
- Auto-sync packages from requirements-deploy.txt
- Docker Compose isolation for parallel worktrees
- Placeholders for secrets with TODO comments

Tooling updates:
- setup_dev_environment.sh: Auto-copies template to .envrc if missing
- worktree_manager: Symlinks .envrc from main repo to worktrees
  (secrets stay in sync, port overrides via .envrc.worktree)
46588e270 Tim Richardson 2026-01-22T11:26:38+11:00 feat(cached_dear): Add pricing cache management web interface
Add a form-based UI for deleting pricing cache entries at
/cached_dear/pricing-cache-management/. The page provides:
- Form to delete cache by customer ID or base cache key
- Customer name autocomplete search
- API endpoint documentation with curl/Python examples
- Explanation of cache key structure and behavior

Also removes .envrc from version control as it contains secrets
(YOUTRACK_TOKEN, etc.). Developers should copy .envrc_template
and configure their own credentials.

New files:
- cached_dear/pricing_forms.py: DeletePricingCacheForm
- cached_dear/pricing_views.py: DeletePricingCacheFormView
- cached_dear/templates/cached_dear/delete_pricing_cache.html
7d97a2ef8 Tim Richardson 2026-01-21T21:47:11+11:00 feat(devenv): Complete Python worktree manager with setup automation
Replace bash scripts with Python implementation that provides:
- Automatic Django migrations after worktree creation
- Fixture setup via new `setup_dev_fixtures` management command
- DearInstanceSettings copying from main repo via dumpdata/loaddata
- Superuser creation with configurable credentials
- Login page verification to confirm successful setup
- Copy of gitignored build artifacts (requirements.txt, build/)

Fix squashed migration for fresh databases:
- core/migrations/0001_squashed_initial.py had RenameModel and
  RenameIndex operations that fail on fresh databases (the old
  model/index names never existed). Fixed by using final model names
  directly in CreateModel operations and removing Rename* operations.

Files removed (replaced by Python implementation):
- make_worktree.sh, cleanup_worktree.sh, worktree_ports.sh
- amazon_eks/helpers/make_worktree.sh (was symlink)

New command: python manage.py setup_dev_fixtures
- Creates core.Module entries for all registered apps
- Optionally creates DearInstanceSettings from environment variables
- Use --skip-dear when loading fixtures via dumpdata/loaddata instead
ae1a10cc5 Tim Richardson 2026-01-21T18:50:57+11:00 feat: Auto-commit by deployer
7e91945bf Tim Richardson 2026-01-21T14:48:24+11:00 fix(devenv): Add explicit Docker volume mount for private/ directory
A symlink inside a Docker bind mount does not resolve in the container
when its target lies outside the mounted tree, causing worktree containers
to fail with "ModuleNotFoundError: No module named 'private'".

Changes:
- docker-compose.yml: Add ${PRIVATE_DIR:-./private}:/opt/app/private:ro
  to x-web-common volumes (read-only mount)
- .envrc: Set PRIVATE_DIR=$PWD/private for main repository
- make_worktree.sh: Include PRIVATE_DIR in generated .envrc.worktree,
  pointing to main repo's private/ directory

The :ro (read-only) flag ensures credentials aren't accidentally modified
from inside containers. All worktrees share the same private/ directory,
which is the desired behavior for shared credentials.
aa7be4d44 Tim Richardson 2026-01-21T14:44:34+11:00 feat(devenv): Enable parallel development environments with git worktrees
Add Docker Compose project isolation so multiple worktrees can run
simultaneously with independent databases, networks, and ports.

Changes:
- docker-compose.yml: Parameterize host ports with ${VAR:-default} syntax,
  remove container_name directives, remove external: true from volumes/networks
- .envrc: Add section 6 for COMPOSE_PROJECT_NAME and port defaults,
  source .envrc.worktree if present for worktree-specific configuration
- make_worktree.sh: Add helper functions for port calculation (md5-based
  deterministic offset), generate .envrc.worktree with unique ports,
  symlink untracked content (private/, local_settings.d/), initialize
  submodules, check for uncommitted changes before proceeding
- setup_dev_environment.sh: Replace external network/volume creation with
  COMPOSE_PROJECT_NAME info display (resources now auto-created by Compose)
- New cleanup_worktree.sh: Safe teardown with dirty-state protection,
  stops Docker stack, removes containers/networks, optionally removes
  volume with explicit confirmation
- New worktree_ports.sh: Port discovery helper for checking allocations

Port allocation strategy:
- Main repo: Original ports (8000, 8080, 5433, 6379, 8090)
- Worktrees: Base + deterministic offset (0-99) from branch name hash
  Web=8000+offset, WS=8100+offset, DB=5400+offset, Valkey=6300+offset

Usage:
  ./make_worktree.sh feature-xyz
  cd ~/worktrees/feature-xyz && docker compose up
  ./cleanup_worktree.sh feature-xyz
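
The port allocation strategy can be sketched in Python as follows. The exact way make_worktree.sh turns the md5 digest into an offset is not shown in this log, so the modulo step and the function names are assumptions.

```python
import hashlib


def port_offset(branch: str) -> int:
    """Deterministic offset in the range 0-99 derived from the branch name."""
    digest = hashlib.md5(branch.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def worktree_ports(branch: str) -> dict[str, int]:
    offset = port_offset(branch)
    return {
        "web": 8000 + offset,
        "ws": 8100 + offset,
        "db": 5400 + offset,
        "valkey": 6300 + offset,
    }


print(worktree_ports("feature-xyz"))
```

The same branch name always maps to the same ports, so restarting a worktree does not change its URLs. Two branches can still hash to the same offset, which is presumably what the port discovery helper is for.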
4ca9b0f02 Tim Richardson 2026-01-21T14:36:51+11:00 chore(docs): Consolidate documentation/ into docs/
Merge documentation/ directory into docs/ for conventional naming.
All 62 files moved with git mv to preserve history.

The docs/ directory now contains:
- All previous documentation/ content
- openapi/ subdirectory (unchanged)
- plans/ subdirectory (unchanged)
96e9da58c Tim Richardson 2026-01-21T14:34:22+11:00 feat(dear_zoho_analytics): Suppress non-actionable missing sales alerts
Add classification rules to filter out missing sales that cannot be repaired,
reducing noise from repeated WARNING emails for permanently non-actionable sales.

Non-actionable classification rules:
- VOIDED status: Voided sales never have financial data to sync
- Credit-note-only: Has authorised CN but no AUTHORISED/PAID invoice
- Old terminal state: >3 years old AND COMPLETED/VOIDED status

Behavior changes:
- Actionable missing sales: WARNING log + email (unchanged)
- Non-actionable missing sales: INFO log only (no email)
- Summary shows breakdown by reason (VOIDED, CN-ONLY, OLD)

Also fixes pre-existing mypy type errors in:
- zoho_integrity_verification.py: Variable shadowing, type annotations
- analytics_tasks.py: Callback types, None guards, variable shadowing
- analytics_backend.py: Union type for AnalyticsTableZohoDef_v2
- dear_analytics_logic.py: refresh_purchase_tables signature
bc4dea043 Tim Richardson 2026-01-21T14:12:15+11:00 chore(docs): Add memory note on Python kubectl stdout buffering
Add a documentation memory note explaining the Python stdout buffering
issue seen when running scripts via `kubectl exec`. The note explains why
output can appear to hang during script execution and gives several ways
to get real-time output, with code examples.
3f035f4f5 Tim Richardson 2026-01-21T14:12:02+11:00 feat(revel_pos): Add configuration page for notification email settings
Add a new Settings page to Revel POS that stores notification email
addresses in KeyValueJson, replacing hardcoded email recipients throughout
the codebase.

Changes:
- Add config.py with get_revel_notification_emails() accessor function
- Add RevelSettingsForm with email validation (one per line format)
- Add RevelSettingsView and /settings/ URL route
- Add Settings link to navbar menu and Settings card to index page
- Replace 10 hardcoded emails in restock_orders.py
- Replace 3 hardcoded emails in tasks.py
- Add HTML email support in JobLog.log_message() for rich notifications

Default behavior returns ['tim@growthpath.com.au'] when no settings exist,
ensuring backward compatibility during rollout.
0d143ec3c Tim Richardson 2026-01-21T13:17:42+11:00 fix(three_pl): Fix stdout buffering in cleanup script for kubectl exec
- Add sys.stdout.reconfigure(line_buffering=True) for line-buffered output
- Add log() helper function with flush=True for kubectl exec compatibility
- Simplify logic to directly target known test IDs instead of scanning
- Remove complex queries that could cause slow execution

Without these changes, the script appeared to hang when run via
kubectl exec because Python block-buffers stdout when it is not
attached to a terminal.
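
A minimal sketch of the two buffering changes. The hasattr guard is an addition for the example, so the snippet also runs when stdout has been replaced by an object without reconfigure().

```python
import sys

# Switch stdout to line buffering so each completed line is written out
# immediately, even when stdout is a pipe rather than a terminal.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(line_buffering=True)


def log(message: str) -> None:
    """Print and flush so progress is visible as it happens."""
    print(message, flush=True)


log("cleanup started")
```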
fbcf01403 Tim Richardson 2026-01-21T13:06:22+11:00 feat(three_pl): Add cleanup script for old Coco EDI test 850 transactions
Create a specialized cleanup script to remove outdated test EDI transactions
that were causing issues with order matching. The script:

- Identifies test 850 transactions based on multiple criteria
- Supports dry-run and execute modes
- Finds and deletes related 855, 856, and 810 transactions
- Provides detailed logging and summary of actions
- Helps resolve SO-116921 issue with incorrect order matching
6bff1a4df Tim Richardson 2026-01-21T12:50:07+11:00 feat(three_pl): Enhance PO and EDI transaction investigation script
Add advanced investigation functions for troubleshooting EDI transaction
linkage issues, specifically:

- `find_all_850s_for_po()`: Retrieve and analyze all 850 transactions
  for a given PO number
- `check_unlinked_recent_850s()`: Identify unlinked 850 transactions
  that could potentially link to a sale
- Enhanced main investigation flow to detect TSCN (Transaction Set
  Control Number) collisions

The new functions help diagnose complex EDI linkage problems by providing
detailed transaction insights and highlighting potential linking errors
in the inbound file manager.
990200dca Tim Richardson 2026-01-21T12:45:07+11:00 fix(coco_edi): Rate-limit EDI warning logs to once per day per sale
Added cache-based rate limiting to prevent log spam when EDI sales are
missing linked 850 transactions or ASNs. Warnings now include a helpful
hint about adding ***EDISKIP*** to the Sale Note field.

Changes:
- ASN task: Rate-limit "no matching 850" warning (24h TTL per sale)
- Invoice task: Rate-limit "no ASN found" warning (24h TTL per sale)
- Invoice task: Add try/except for 850 lookup with rate-limited warning
- All warnings now include the EDISKIP hint for resolution

Uses Django cache with 24-hour TTL keyed by sale ID.
54aa097ed Tim Richardson 2026-01-21T12:39:03+11:00 fix(coco_edi): Filter 850 transaction lookup by interchange to prevent TSCN collisions
When processing inbound EDI interchanges, the transaction_row lookup was only
filtering by message_type and transaction_set_control_number. If Coco reuses
TSCNs across different interchanges (which EDI allows), the .first() could
return an old transaction from a different interchange, causing incorrect
sale-to-PO linkage.

Changes:
- Look up the interchange database row using ISA header fields (sender_id,
  receiver_id, interchange_control_number)
- Filter the EDIX12TransactionJournal lookup by interchange when available
- Add warning log if interchange row cannot be found (graceful degradation)

This prevents scenarios where a sale gets linked to the wrong 850 PO when
TSCNs are reused across interchanges.
b629a971b Tim Richardson 2026-01-21T12:34:19+11:00 docs: Update documentation to reference mypy instead of ty
Update CLAUDE.md and Serena memories to reflect the migration from
Astral's ty type checker to mypy with django-stubs:
- CLAUDE.md: Update type checking instructions
- code_style memory: Add mypy configuration details
- suggested_commands memory: Add mypy command examples
d4f4c74e7 Tim Richardson 2026-01-21T12:32:46+11:00 chore: Migrate from ty to mypy with django-stubs for type checking
Replace Astral's ty type checker with mypy + django-stubs plugin for
better Django type support. django-stubs understands Django's dynamic
attributes (get_FOO_display(), ForeignKey _id fields, QuerySets) which
eliminates many type: ignore comments.

Changes:
- Add mypy.ini with django-stubs plugin configuration
- Replace ty and django-types with django-stubs[compatible-mypy]
- Add types-requests for requests library stubs
- Remove unnecessary type: ignore[unresolved-attribute] comments
- Keep type: ignore[attr-defined] for Django admin display attributes
  (short_description, admin_order_field) which django-stubs doesn't cover
- Fix 4 pre-existing type bugs in KeyValueJson methods:
  - get_timestamp: handle None value_datetime with fallback
  - get_jdata: remove invalid type comment
  - get_value: guard against None before calling .encode()
- Fix invalid type comment in order_autopicking_generic.py
adbd75279 Tim Richardson 2026-01-21T11:49:07+11:00 refactor(amazon_eks): Remove per-site git status checks in deployment
Simplify deployment process by removing redundant git status checks for
each site. Git status verification is now performed only once before
Docker image build, recognizing that container images are immutable once
pushed to ECR.

This change prevents unnecessary deployment blocking and streamlines the
deployment workflow in the TUI deployer.
247b4689c Tim Richardson 2026-01-21T11:43:09+11:00 feat(coco_edi): Add defensive PO number validation for EDI transactions
Implement additional validation to ensure 850 purchase order numbers
match the sale's CustomerReference before generating ASN and invoice
transactions. This helps prevent data integrity issues caused by
incorrectly linked or stale test data.

Key changes:
- Added PO number validation in find_shipments_and_make_asn()
- Added PO number validation in find_invoices_and_make_810s()
- Log warning messages with details when PO number mismatches occur
- Add script to investigate EDI transaction linkage issues
acd38ccd2 Tim Richardson 2026-01-21T11:42:57+11:00 feat(cached_dear): Add multi-warehouse selection with case-insensitive filtering
Replace single warehouse text field with multi-select dropdown populated from
Dear locations. Warehouses are now filtered case-insensitively, and multiple
warehouses can be selected.

Changes:
- Form: warehouse_name CharField -> warehouse_names MultipleChoiceField
- View: get_location_choices() fetches locations from DearCache (active first,
  then deprecated)
- Invoice logic: Uses Lower(KeyTextTransform()) for case-insensitive filtering
- Tasks: Backward compatible - reads warehouse_names from config, falls back to
  warehouse_name parameter
d0b388e2c Tim Richardson 2026-01-21T11:25:42+11:00 fix(cached_dear): Sync auto-invoicing enabled checkbox with periodic task state
When feature_settings is empty (no config saved yet), the form now reads
the periodic task's actual enabled state instead of defaulting to False.
This ensures pre-existing tasks like Canton Tea's show as enabled.

Also changed consider_draft_invoices default from True to False to
preserve historical behavior (old code didn't count DRAFT invoices).
32451d5c2 Tim Richardson 2026-01-21T11:15:32+11:00 feat(cached_dear): Add auto-invoicing configuration UI with DRAFT invoice support
Add frontend configuration view to manage auto-invoicing behavior:
- New feature_settings JSONField on DearInstanceSettings for extensible config
- Auto-invoicing settings page under Misc Functions menu
- "Consider DRAFT invoices" option prevents duplicate invoices when DRAFT exists
- Settings read at runtime, so UI changes take effect immediately
- Periodic task status display with enable/disable toggle

Also fixes legacy type issues across invoice_logic.py, models.py, and tasks.py:
- Nullable parameter defaults (Type | None = None)
- Django magic method type ignores (get_*_display)
- TypedDict casts for API calls

Migration: 0064_add_feature_settings_to_dearinstancesettings
2f634da03 Tim Richardson 2026-01-21T10:45:20+11:00 chore(envrc): Improve Python virtual environment setup
Switch from generic layout python3 to explicit Python 3.12 venv
creation using uv. This provides more predictable and controlled
virtual environment management across different development
environments.
c4d8d361e Tim Richardson 2026-01-21T10:45:01+11:00 fix(dear_zoho_analytics): Refactor raw SQL to Django ORM and fix type errors in zoho_integrity_verification
DAS-378: The find_missing_purchasing_documents function was using raw SQL
with manual cursor handling, which may have caused connection issues in
long-running Celery tasks. Refactored to use Django ORM which handles
connection lifecycle automatically.

Changes:
- Replace raw SQL query with DearCache.objects.filter().values_list()
- Remove try/except block (ORM exceptions bubble up with full traceback)
- Add None guards for zoho_enhanced_client at 5 locations to fix ty type errors
- Apply ruff formatting to entire file

The ORM refactor should resolve the underlying connection issue. If errors
still occur, they will now include full exception details in Celery logs.
a019847b7 Tim Richardson 2026-01-20T19:52:40+11:00 feat(cached_dear): Add composite indexes for PO integrity check performance
Migration adds two indexes to optimize DearCache queries:
1. idx_dearcache_object_type_account_id - composite index on (object_type, dear_account_id)
   Speeds up WHERE clause filtering for queries like the PO integrity check
2. idx_dearcache_jdata_id - index on jdata->>'ID'
   Supports any legacy queries still using JSONB extraction

Uses CREATE INDEX CONCURRENTLY with atomic=False to avoid locking the table
during index creation on large production databases.
132b57c31 Tim Richardson 2026-01-20T19:48:13+11:00 perf(dear_zoho_analytics): Use object_uniqueID instead of JSONB for PO integrity check
The PostgreSQL query to get PO IDs from DearCache was taking ~5.5 minutes
because it extracted from JSONB: `jdata ->> 'ID'`

Changed to use the indexed `object_uniqueID` column directly, which should
reduce this to seconds.
14acd87a7 Tim Richardson 2026-01-20T19:24:38+11:00 perf(dear_zoho_analytics): Optimize PO integrity check Zoho query with GROUP_CONCAT
The PO integrity check was slow for accounts with large numbers of POs (e.g., 44,000+)
because it fetched each PO GUID as a separate row from Zoho Analytics.

Added get_all_unique_po_guids_v2() which uses the GROUP_CONCAT trick (already used
for sales) to batch 1000 GUIDs per row. This reduces the row count from 44,000 to ~44,
dramatically improving query time and reducing API costs.

Also added timing logs to get_unique_ids_from_zoho_masterdata for diagnostics.
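
The batching arithmetic and the client-side unpacking can be sketched as follows. This is a minimal illustration, assuming each returned row holds one delimiter-joined string of GUIDs; the function names and the comma separator are hypothetical and are not taken from get_all_unique_po_guids_v2().

```python
import math


def expected_row_count(total_guids: int, batch_size: int = 1000) -> int:
    """Number of rows returned when GUIDs are concatenated batch_size per row."""
    return math.ceil(total_guids / batch_size)


def unpack_concatenated_guids(rows: list[str], separator: str = ",") -> set[str]:
    """Split each concatenated row back into individual GUIDs.

    The separator is an assumption; use whatever the query's GROUP_CONCAT emits.
    """
    guids: set[str] = set()
    for row in rows:
        # Ignore empty fragments, e.g. from a trailing separator.
        guids.update(part.strip() for part in row.split(separator) if part.strip())
    return guids
```

For 44,000 GUIDs at 1000 per row, expected_row_count gives 44, matching the figure quoted in the commit message.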
d180df640 Tim Richardson 2026-01-20T19:08:51+11:00 fix(dear_zoho_analytics): Add safeguard to PO integrity check for incomplete Zoho results
The PO integrity check was incorrectly marking all purchase orders as "missing"
when the Zoho API query returned incomplete results (e.g., due to timeout).
This caused unnecessary full re-syncs of tens of thousands of POs.

Added a safeguard that aborts the integrity check if >50% of POs appear
"missing" from Zoho, as this likely indicates the Zoho query failed to return
complete results rather than actual missing data.

Thresholds:
- MISSING_THRESHOLD_PERCENT = 50 (abort if more than half appear missing)
- MINIMUM_DEAR_POS_FOR_CHECK = 100 (only apply safeguard for accounts with
  substantial data)
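
A minimal sketch of how these two thresholds might combine is shown below. The constant names and values come from the commit message; the function name, its signature, and the choice to skip the safeguard when the count is strictly below the minimum are assumptions.

```python
MISSING_THRESHOLD_PERCENT = 50
MINIMUM_DEAR_POS_FOR_CHECK = 100


def should_abort_integrity_check(dear_po_count: int, missing_count: int) -> bool:
    """Return True when the "missing" share looks like a failed Zoho query.

    Hypothetical helper: the real check may be structured differently.
    """
    if dear_po_count < MINIMUM_DEAR_POS_FOR_CHECK:
        # Small accounts: the safeguard is not applied (boundary is an assumption).
        return False
    missing_percent = missing_count / dear_po_count * 100
    # Abort only if more than half of the POs appear missing.
    return missing_percent > MISSING_THRESHOLD_PERCENT
```

Under these assumptions, 501 missing out of 1000 aborts the check, while exactly 500 out of 1000 does not.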
13850c1f8 Tim Richardson 2026-01-20T18:16:26+11:00 chore(cached_dear): Add test harness to rename Product Family option 'Type' to 'Product Type'
Script to update Product Family option names in the Dear API. It finds families
where Option1Name, Option2Name, or Option3Name equals "Type" (case-insensitive)
and updates them to "Product Type".

Features:
- Dry-run mode by default (safe to run without changes)
- --execute flag required to apply updates
- Designed to run on worker pod via kubectl exec
- Logs all matches and update results

Usage:
  # Dry run
  kubectl -n loam exec -i deploy/worker-process-pool -- python - < cached_dear/test_harness_update_product_family_options.py

  # Execute changes
  kubectl -n loam exec -i deploy/worker-process-pool -- python - --execute < cached_dear/test_harness_update_product_family_options.py
b1b36ae23 Tim Richardson 2026-01-20T18:15:40+11:00 fix(dear_zoho_analytics): Prevent PO integrity check from incorrectly deleting all purchase orders
The purchase order integrity check was incorrectly flagging all POs as orphans
and deleting them from Zoho on every run, causing a full re-sync cycle.

Root cause: The Dear cache query filtered by last_modified date (30 days), but
the Zoho query fetched ALL POs. When comparing these mismatched sets, any Zoho
PO not modified recently was incorrectly classified as an "orphan" and deleted.

Changes:
- Add unfiltered Dear cache query (all_dear_po_ids) for true orphan detection
- Fix orphan logic: only flag POs that don't exist in Dear cache at all
- Add diagnostic warning when orphan ratio exceeds 50% (indicates data issue)
- Improve type annotations to use modern Python 3.10+ union syntax (int | None)
- Fix ProductAvailabilityType loop variable conflict in verify_product_avail
- Add AnalyticsTableZohoDef_v2 to refresh_product_availability_v2 signature
- Fix cursor.fetchone() None handling in fix_missing_and_unwanted_records

The sales integrity check did not have this bug - it correctly queries all Dear
cache records without date filtering.
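
The root cause and the fix come down to which set the Zoho IDs are compared against. A minimal sketch with plain Python sets follows; the function names are hypothetical, and only all_dear_po_ids is named in the commit message.

```python
def find_orphans_buggy(zoho_po_ids: set[str], recent_dear_po_ids: set[str]) -> set[str]:
    """Old behaviour: compare against the date-filtered (30 day) Dear set.

    Any Zoho PO not modified recently is wrongly reported as an orphan.
    """
    return zoho_po_ids - recent_dear_po_ids


def find_orphans(zoho_po_ids: set[str], all_dear_po_ids: set[str]) -> set[str]:
    """Fixed behaviour: a PO is an orphan only if it is absent from the
    unfiltered Dear cache."""
    return zoho_po_ids - all_dear_po_ids


def orphan_ratio_is_suspicious(orphan_count: int, zoho_count: int) -> bool:
    """Diagnostic warning condition: orphan ratio above 50%."""
    return zoho_count > 0 and orphan_count / zoho_count > 0.5
```

For example, with Zoho holding {"a", "b", "c"}, the full Dear cache holding {"a", "b"}, and only "a" modified recently, the old logic reports {"b", "c"} while the fixed logic reports only {"c"}.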
206f17321 Tim Richardson 2026-01-20T16:55:16+11:00 refactor(dear_zoho_analytics): Improve DearCache flag update concurrency handling
Refactored the `update_dear_cache_flags` function to use `select_for_update(skip_locked=True)`
for more robust concurrent processing. Key improvements include:

- Replaced complex deadlock retry mechanism with a simpler, more reliable approach
- Added explicit handling of locked rows using SKIP LOCKED
- Improved logging to track updated, skipped, and failed rows
- Maintained transaction atomicity while preventing process blocking
- Added batch progress tracking

This change reduces the risk of deadlocks and improves performance in scenarios
with multiple concurrent analytics tasks updating DearCache records.
3ed10e83b Tim Richardson 2026-01-20T16:43:21+11:00 feat(starshipit): Improve skip list UX - copy defaults to field and add documentation
- Updated generic_formview_template to display description and notes
- Added info box to Analyze Shipping Methods view explaining where skip
  list is saved and listing default methods
- Changed defaults handling: now copied into field on first access
  rather than merged at runtime, giving admins full visibility and control
- Admins can now remove default methods if they configure Starshipit rules
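
The change from merging defaults at runtime to copying them into the field on first access can be sketched as below. This is an illustration only: the settings are modelled as a plain dict, and the key name, the function name, and the default method names are hypothetical.

```python
# Hypothetical default methods, for illustration only.
DEFAULT_SKIP_METHODS = ["Pickup", "Local Delivery"]


def get_skip_list(settings: dict) -> list[str]:
    """Return the skip list, seeding it with the defaults on first access.

    After seeding, the stored list is the single source of truth, so an
    admin can see and remove default entries.
    """
    if "skip_list" not in settings:
        # Copy, so later edits never mutate the module-level defaults.
        settings["skip_list"] = list(DEFAULT_SKIP_METHODS)
    return settings["skip_list"]
```

Because nothing is merged back in at runtime, a default method that an admin removes stays removed on later reads.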
03fc76d95 Tim Richardson 2026-01-20T16:37:07+11:00 fix(starshipit): Rename 'Admin' menu to 'Configuration' to avoid confusion with Django Admin