Saturday, June 13, 2026

Do CIOs need to create an Enterprise AI Harness?

SUMMARY: If the cost of public AI continues to rise, because of various market shortages, should CIOs start looking at backup plans to better own their AI journeys and futures?

SHOW: 1036

SHOW TRANSCRIPT: The Enterprise AI Show #1036 Transcript

SHOW VIDEO: https://youtu.be/ZgkMF7G3Yfo

SHOW SPONSORS:

SHOW NOTES:


THESIS: It surfaces as different control points, but CIOs are ultimately trying to figure out how to get value from Enterprise AI while delivering consistency across different teams and use cases. Let’s explore what this “Enterprise Harness” is starting to look like.

  • Enterprise Clearinghouse 
  • Enterprise Intelligence (a.k.a. Middleware)
  • Enterprise Catalog - Models as a Service, Agents as a Service
  • Enterprise Skills or Shareable Prompt Harnesses
  • Semantic Routing to Models
  • AI Gateway Controls

FEEDBACK?



from The Cloudcast (.NET) https://ift.tt/KSDx2HA
via IFTTT

Critical Splunk Enterprise Flaw Lets Attackers Run Code Without Authentication

Splunk has released security updates to address a critical security flaw in Splunk Enterprise that could be exploited to conduct unauthenticated file operations and even remote code execution.

The vulnerability, tracked as CVE-2026-20253, is rated 9.8 on the CVSS scoring system.

"In Splunk Enterprise versions below 10.2.4 and 10.0.7, an unauthenticated user could create or truncate arbitrary files through a PostgreSQL sidecar service endpoint," Splunk said in an alert this week.

"The vulnerability exists because the PostgreSQL sidecar service endpoint lacks authentication controls, allowing any network-reachable user to invoke file operations without credentials."

The issue has been addressed in the following versions -

  • Splunk Enterprise 10.0.0 to 10.0.6 - Fixed in 10.0.7
  • Splunk Enterprise 10.2.0 to 10.2.3 - Fixed in 10.2.4
  • Splunk Enterprise 10.4 - Not affected
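The version list above can be turned into a quick inventory check. This is a minimal sketch, not an official Splunk tool: the function names are hypothetical, and it only classifies the branches named in the list (10.0, 10.2, and 10.4). Any other branch is reported as "unknown" rather than guessed at.

```python
import re

# Fixed releases per affected branch, taken from the advisory list above.
FIXED_IN = {(10, 0): (10, 0, 7), (10, 2): (10, 2, 4)}
# Branches the advisory explicitly lists as not affected.
NOT_AFFECTED_BRANCHES = {(10, 4)}


def parse_version(text):
    """Turn '10.2.3' or '10.4' into a 3-part tuple of ints, padding with zeros."""
    parts = [int(p) for p in re.findall(r"\d+", text)][:3]
    if len(parts) < 2:
        raise ValueError(f"unrecognized version: {text!r}")
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def classify(version_text):
    """Return 'affected', 'fixed', 'not affected', or 'unknown'."""
    version = parse_version(version_text)
    branch = version[:2]
    if branch in NOT_AFFECTED_BRANCHES:
        return "not affected"
    fixed = FIXED_IN.get(branch)
    if fixed is None:
        return "unknown"  # branch not covered by the list above
    return "affected" if version < fixed else "fixed"
```

Feeding it the version strings from an asset inventory gives a first-pass triage list. Anything returned as "unknown" should be checked against the vendor advisory directly.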

Splunk, which is part of Cisco, said Splunk Cloud is not impacted by the vulnerability as Postgres sidecars are not used in the product.

What the Flaw is All About

On Friday, watchTowr Labs released additional technical details of CVE-2026-20253, stating it could be exploited to achieve pre-authenticated remote code execution on susceptible systems through the "/v1/postgres/recovery/backup" and "/v1/postgres/recovery/restore" endpoints.

The attack chain works as follows -

  • Connect to an attacker-controlled database and dump its contents into an arbitrary file using the /backup endpoint
  • Load the dump of the attacker-controlled database into the local PostgreSQL instance using the /restore endpoint by including a "passfile" argument that specifies the path to a ".pgpass" file ("/opt/splunk/var/packages/data/postgres/.pgpass") containing the password for the "postgres_admin" user
  • SQL queries defined in the database dump will get executed by Splunk's PostgreSQL instance

An attacker could weaponize this weakness to define a new function that uses lo_export - a function used to extract a BLOB from the database and save it as a file on the file system - to write attacker-controlled content to a file, following which the function gets executed during the restoration process.

"At this point, we can authenticate, restore attacker-controlled SQL, and interact with the local database," security researchers Piotr Bazydlo and Yordan Ganchev said. "Once we could restore attacker-controlled SQL into the local PostgreSQL instance, we quickly put together a database dump template that gave us a controlled file write."

Armed with an arbitrary file write primitive on the Splunk file system, an attacker could escalate further to remote code execution by overwriting a Python script that Splunk frequently executes (e.g., "/opt/splunk/etc/apps/splunk_secure_gateway/bin/ssg_enable_modular_input.py") to include the malicious payload.

The entire sequence of actions is below -

  • Create a database and configure it such that a user can authenticate without a password and grant it sufficient permissions to invoke functions like lo_export
  • Use the /backup endpoint to drop a dump of the remote database onto the Splunk file system
  • Use the /restore endpoint to load the malicious database dump, trigger execution of the malicious function during the restore process, and write an attacker-controlled Python script to the Splunk file system

Although there is no evidence of the flaw being exploited in the wild, the availability of exploit details may be enough to prompt opportunistic attacks by threat actors. It's essential that users move quickly to apply the fixes to stay protected.



from The Hacker News https://ift.tt/sTrnhiY
via IFTTT

Friday, June 12, 2026

China-Linked Hackers Backdoored Linux Login Software to Hide for Nearly a Decade

Instead of hiding on the laptops and servers defenders watch most closely, a China-nexus group spent close to a decade hidden inside the Linux login system itself.

Sygnia, which tracks the group as Velvet Ant, says it backdoored the PAM and OpenSSH components that decide who is allowed to sign in, planting its access where ordinary cleanup could not reach it. The network it targeted had no direct internet access, so the group first staged through internet-facing systems to get there.

The earliest traces go back to 2016. Instead of dropping new malware that a scanner might catch, the attacker changed the trusted login programs themselves. Nothing obvious appeared, and no exploit was needed, so the activity looked like normal administration.

On many machines, the attacker replaced the main PAM login module with backdoored copies. Some let them in with a secret password; others quietly recorded real usernames and passwords as people logged in.

Researchers found nine separate versions. The OpenSSH programs were altered the same way, logging credentials and every command typed, with a hidden switch to turn that logging off when needed.

Reaching the isolated network at all took extra work. The attacker used other disguised tools and an internet-facing web server as a bridge, passing commands through it to open remote sessions deep inside the segment that had no direct internet access.

Because the login system itself was compromised, normal containment did little. Password resets and killed sessions do not help when the thing that checks those credentials is working for the attacker.

This is not new for the group. Each time defenders find one foothold, Velvet Ant moves to gear they watch less and sets up there. In a 2024 case, Sygnia found the same actor turning internet-exposed F5 BIG-IP appliances into internal command servers.

Later that year, it reported the group exploiting a Cisco NX-OS flaw, CVE-2024-20399, to plant a backdoor on the switches. That bug needs admin access first, so it is a persistence tool, not a remote break-in. Cisco patched it in July 2024, and CISA flagged it as exploited the next day.

Operation Highland is the same idea, one level deeper. Load balancers, switches, and the login software itself are trusted by default and rarely checked, which is exactly why a patient attacker hides inside them.

Operation Highland is not a one-CVE problem. The attacker changed trusted programs after getting in, so the fix is verification, not patching, and cleanup is delicate: a wrong replacement can lock admins out of a live system.

  • Watch the login files. Monitor the PAM and OpenSSH programs and their key files for any change, and alert when they change.
  • Hunt by checking what changed, not by waiting for an alert. Compare these programs against known-good copies, because nothing will flag them for you.
  • Remove the backdoor before resetting passwords, or the new ones get stolen the same way. Test any replacement in a lab first.
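The "compare against known-good copies" step can be sketched as a hash comparison. This is an illustration under stated assumptions, not Sygnia's tooling: the function names are hypothetical, the caller supplies the file paths (PAM and OpenSSH locations vary by distribution), and the baseline hashes must come from a trusted source such as the vendor package or a clean image. On a host that may already be compromised, run a check like this from trusted media, since local tools could be altered too.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_drift(baseline):
    """Compare files against a {path: expected_sha256} baseline.

    Returns {path: reason} for every file that is missing or whose hash differs.
    """
    drift = {}
    for path, expected in baseline.items():
        if not Path(path).is_file():
            drift[path] = "missing"
        elif sha256_of(path) != expected:
            drift[path] = "hash mismatch"
    return drift
```

An empty result means every listed file matched its baseline. Any entry is a reason to investigate before resetting credentials.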

The earlier F5 and Cisco cases have their own checks: patch CVE-2024-20399 on Cisco Nexus gear, and watch F5 boxes for unexpected outbound connections.

The wider lesson is plain: infrastructure that sits outside normal monitoring still needs integrity checks, and that now includes the login layer.



from The Hacker News https://ift.tt/glpBnDe
via IFTTT

AI Storage Bottlenecks: Why AI Workloads Slow Down and How to Fix Them

GPU clusters cost serious money whether they’re training or waiting. In most production pipelines, storage constraints idle expensive hardware more often than compute limits do. Slow dataset reads and checkpoint write contention compound across every training run, and fragmented data tiers make the problem worse.

Below, we’ll cover where these bottlenecks show up, how you’ll spot them, and which architectural decisions actually help – though I’ll admit up front that not every fix needs new hardware, and some of the worst bottlenecks are self-inflicted (something I wish more teams understood before they started calling vendors).

What counts as a bottleneck

An AI storage bottleneck is any weak point that prevents workloads from getting data fast enough to keep compute busy. Constraints can sit in throughput, latency, IOPS, metadata performance, or network bandwidth. GPU clusters consume power and cooling whether they’re working or not, so the cost of getting this wrong shows up from the first idle cycle. If you’ve ever watched a utilization graph flatline while the SAN LEDs blink happily, you know exactly what I mean.

AI workloads place different demands on storage than traditional enterprise apps. Training repeatedly reads the same large datasets from many parallel workers. Checkpointing writes hundreds of gigabytes in short bursts. Inference needs fast, predictable access to model weights and cached context. A system that’s tuned for one pattern can fail miserably on the others.

Effects compound fast. You add GPUs and see little improvement because the limit sits earlier in the data path. Sad truth – many teams we’ve worked with plan compute carefully and treat storage as an afterthought.

That ordering produces the exact bottlenecks described here. Every time.

Where bottlenecks appear in the pipeline

Where does storage actually choke? AI workloads span several stages, and each creates a different storage demand.

 

Figure 1: AI pipeline and common bottlenecks

The larger your pipeline becomes, the more these demands overlap. In many cases, ingestion, training reads, checkpointing, and backup all compete for the same throughput and network bandwidth at once.

Meta’s work on ML training data illustrates this at scale. The company trains thousands of models on petabyte-scale datasets using Tectonic, its distributed file system, while a separate preprocessing tier handles decoding and conversion to tensor formats. At that scale, storage and network paths are as much a part of AI performance as the training code itself. Maybe more.

Training reads versus inference latency

Training and inference put different demands on storage, so the diagnostic approach depends on which phase is slow.

Training bottlenecks usually appear in dataset reads or checkpoint writes. When an LLM trains across dozens of GPUs, each worker repeatedly reads from the same shared dataset. If your storage can’t serve those requests concurrently, workers stall, and your expensive silicon sits idle while the clock ticks. Checkpoint I/O creates a separate challenge. Model and optimizer state can reach hundreds of gigabytes per save, and research into LLM checkpointing overhead shows this can eat a large share of total training time when storage isn’t built for burst writes.

Inference behaves differently. KV-cache access speed and model loading latency matter most. Research on dual-path KV-cache architectures (DualPath) has shown that agentic LLM inference workloads can become dominated by cache storage I/O rather than compute. A model that runs efficiently once loaded can still deliver poor user-facing latency if storage retrieves cached context slowly.

Symptoms differ. When debugging training jobs, low GPU utilization during dataset loading is the first thing to check. In inference, it’s high first-token latency even when the model itself performs well.

Additionally, enterprise data often arrives scattered across NAS platforms, databases, and cloud buckets before it ever reaches an ML pipeline. Time spent federating and normalizing that data can become the first bottleneck you hit, before any GPU is involved.

Agentic inference introduces another challenge by generating a persistent write stream of tool calls, trace logs, intermediate state, and telemetry. Teams that don’t account for this when planning inference storage frequently see write queues build up under sustained traffic.

Why storage chokes

Slow shared storage serving multiple GPU jobs simultaneously is one of the most common causes of AI performance issues. When several training jobs compete for the same NAS or SAN, throughput per job drops and latency rises. A 10 GbE storage network (still very common in older racks, unfortunately) feeding a cluster that can consume far more bandwidth becomes the ceiling regardless of the hardware behind it. No amount of flash on the array side fixes a network pipe that’s simply too narrow, a lesson teams seem to relearn every budget cycle.

I’ve seen too many teams start with a good old shared NAS because it’s simple to deploy. That works for early experiments. Once several GPU servers read the same dataset in parallel, the NAS controller or network caps the entire cluster, even when the underlying drives still have capacity to spare. It happens.

Data scattered across tiers creates another bottleneck. Excessive movement between NAS, object storage, local disks, and cloud buckets adds latency at every step. A backup job running during an active training window can consume enough bandwidth to measurably reduce GPU utilization. Teams that haven’t isolated these traffic types encounter this pattern regularly. It’s not subtle.

Small-file datasets introduce their own failure mode. A computer vision training job may involve millions of individual JPEG files. On paper, the storage system’s got more than enough throughput. In practice, the file system spends a disproportionate amount of time locating and opening files. Metadata performance becomes the bottleneck. Bandwidth is rarely the issue, and no amount of sequential read optimization fixes it because the problem isn’t the read – it’s the lookup.

How to tell storage is the problem

Check the basics first. The most obvious signal is low GPU utilization during data loading. If your GPUs spend significant time waiting for data instead of processing it, storage or preprocessing is often the real constraint. Long data loader wait times relative to actual compute time are a common indicator. At the infrastructure level, high storage latency, saturated network links, and elevated queue depth usually point to the same problem.
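Data loader wait time relative to compute time can be measured with a few timers around the training loop. The sketch below is framework-agnostic and the names are hypothetical: `batches` is any iterable that yields training batches, and `step` is whatever function processes one batch. One caveat to keep in mind: if `step` launches GPU work asynchronously, it has to synchronize before returning, or compute time will be undercounted.

```python
import time


def measure_wait_fraction(batches, step):
    """Run step(batch) for each batch and time fetching vs. processing.

    Returns (wait_seconds, compute_seconds, wait_fraction).
    """
    wait = compute = 0.0
    iterator = iter(batches)
    while True:
        start = time.perf_counter()
        try:
            batch = next(iterator)  # time blocked waiting for data
        except StopIteration:
            break
        fetched = time.perf_counter()
        step(batch)  # time spent in the training step
        done = time.perf_counter()
        wait += fetched - start
        compute += done - fetched
    total = wait + compute
    return wait, compute, (wait / total if total else 0.0)
```

A wait fraction that stays high across a run suggests the GPUs are being starved by storage or preprocessing rather than limited by compute.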

One reliable test is to stage the active dataset on local NVMe storage and run the same training job again. If performance improves significantly, the bottleneck is likely in shared storage or the network path between compute and storage. Checkpoint write duration provides a useful secondary check. Writes that take minutes instead of seconds almost always indicate storage saturation.
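Before rerunning a full training job, a rough read test can show whether the two locations differ at all. This is a minimal sketch with a hypothetical function name: it reads every file under a directory once, single-threaded, and reports throughput. It does not reproduce the parallel access pattern of real workers, and a second pass over the same files may be served from the OS page cache, so use a dataset larger than RAM or a cold cache when comparing.

```python
import time
from pathlib import Path


def read_throughput(root, chunk_size=8 << 20):
    """Read every file under root once; return (bytes_read, seconds, MB_per_s)."""
    total = 0
    start = time.perf_counter()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            with open(path, "rb") as handle:
                while True:
                    chunk = handle.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / elapsed / 1e6 if elapsed > 0 else 0.0
    return total, elapsed, rate
```

Run it once against the shared mount and once against the local NVMe copy. A large gap between the two rates points at shared storage or the network path.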

Track metrics together. GPU utilization, storage throughput, latency, IOPS, queue depth, network saturation, data loader time, and checkpoint write duration all tell part of the story. Teams identify bottlenecks faster when GPU idle time and storage performance are correlated instead of being treated as separate concerns. They don’t have to sit on one screen, but they do have to be correlated, and honestly most monitoring tools don’t do this well out of the box.

Storage architectures for AI workloads

AI pipelines usually need more than one storage layer. A data lake, a hot training tier, a checkpoint target, and an inference path often have different and sometimes conflicting performance requirements. For a detailed breakdown of storage types designed specifically for AI, see AI storage in 2026: types, benefits, and vendors on the StarWind blog.

 

Figure 2: Typical storage layout to keep GPUs busy

Each architecture solves a different problem, so choosing the right one depends on whether your priority is throughput, latency, scalability, operational simplicity, or cost.

The most common architectural mistake is trying to use one storage platform for every stage of the pipeline. Different tiers call for different platforms. For example, platforms such as WekaFS, VAST Data, and DataCore Nexus are designed specifically for HPC-style access patterns, while NVMe storage via NVMe-oF serves the hot tier, whether that means rapid model loading during inference or handling burst checkpoint writes during training.

For edge AI deployments and read-heavy workloads in particular, NVMe-oF makes shared flash accessible across local nodes without requiring a central SAN. StarWind VSAN and DataCore SANsymphony both support this transport layer for compact edge clusters running local inference.

How to reduce AI storage bottlenecks

The right fix depends on where the bottleneck actually exists.

We’ve found that data placement is usually the fastest improvement you can make, and it’s also the cheapest because it often requires nothing more than moving data to a different mount point before the job starts. Stage the active training dataset on local NVMe or a high-throughput shared tier before the job starts, not while it’s already running. Data movement happens once instead of competing with training traffic throughout the run.
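Staging before the job starts can be as simple as a one-time copy guarded by a marker file. The sketch below assumes the dataset is a directory tree and that the local NVMe mount has room for it. The function name and the `.staged` marker are hypothetical conventions, not part of any tool.

```python
import shutil
from pathlib import Path


def stage_dataset(source, local_root):
    """Copy source into local_root once; return the staged path.

    A marker file records a completed copy so reruns skip the transfer.
    """
    source = Path(source)
    target = Path(local_root) / source.name
    marker = target / ".staged"
    if marker.exists():
        return target
    if target.exists():
        shutil.rmtree(target)  # discard a partial copy from an interrupted run
    shutil.copytree(source, target)
    marker.touch()
    return target
```

Calling this from the job's setup step, before any GPU worker starts, keeps the copy from competing with training reads.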

For distributed training, avoid routing every GPU worker through a single NAS controller. One overloaded controller caps the entire cluster regardless of what the underlying hardware can do. Parallel file systems and scale-out storage spread the load across multiple nodes and remove that single point of contention.

Checkpoint storage is another area that often receives attention too late, and I’ll admit we’ve missed this ourselves more than once. When checkpoint traffic shares the same path as training reads, training performance usually suffers. Separating checkpoint storage onto its own tier, even a relatively small one, often resolves the problem without requiring a major architectural redesign.
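To see whether a candidate checkpoint tier absorbs burst writes, it helps to time a save that is actually flushed to the device. This is a minimal sketch with a hypothetical function name. It writes an in-memory payload, so it measures the storage path only, not the cost of serializing model state.

```python
import os
import time


def timed_checkpoint_write(path, payload):
    """Write payload to path, force it to stable storage, and time the whole save.

    Returns (seconds, MB_per_s).
    """
    start = time.perf_counter()
    with open(path, "wb") as handle:
        handle.write(payload)
        handle.flush()
        os.fsync(handle.fileno())  # without this, the timing mostly measures page cache
    elapsed = time.perf_counter() - start
    rate = len(payload) / elapsed / 1e6 if elapsed > 0 else 0.0
    return elapsed, rate
```

Running the same payload against the shared training path and against a separate checkpoint target, ideally while a training job is reading, shows how much contention the separation removes.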

Not every bottleneck requires new hardware. Some just need better configuration. Data loader optimization can be surprisingly effective. Serial file reads and poorly configured loaders create CPU-side delays that look like storage problems but aren’t. Prefetching and parallel loading can significantly reduce wait times. Small-file sprawl is another issue worth addressing early. Packaging datasets into larger shards reduces metadata overhead before it becomes the limiting factor.
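Packaging small files into larger shards can be done with the standard library alone. The sketch below uses uncompressed tar archives as the shard format, which is an assumption for illustration. The function name and the `files_per_shard` default are hypothetical, and the output directory should sit outside the source tree.

```python
import tarfile
from pathlib import Path


def pack_shards(source_dir, output_dir, files_per_shard=1000):
    """Pack files under source_dir into uncompressed tar shards.

    Returns the list of shard paths written.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    shards = []
    for start in range(0, len(files), files_per_shard):
        shard_path = output_dir / f"shard-{start // files_per_shard:05d}.tar"
        with tarfile.open(shard_path, "w") as archive:
            for path in files[start:start + files_per_shard]:
                # Store paths relative to the dataset root.
                archive.add(path, arcname=str(path.relative_to(source_dir)))
        shards.append(shard_path)
    return shards
```

Each shard is then one open and one mostly sequential read instead of a thousand metadata lookups, which is where the overhead reduction described above comes from.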

Backup traffic deserves special attention as well. Giving backups their own schedule or their own network path is usually more effective than simply adding capacity. More bandwidth doesn’t eliminate contention if competing workloads continue to share the same resources.

Cloud, on-premises, and hybrid AI storage

The right deployment model depends on your workload requirements, data sensitivity, and how much latency your applications can tolerate.

Cloud environments work well for burst training and short-lived experiments. You can provision compute close to managed storage, complete the training run, and release the resources afterward. The issue surfaces at scale. Egress costs for multi-terabyte training datasets can match the compute cost of the training job itself, and staging data near cloud compute before each run is usually more practical than treating storage and compute placement as independent decisions. Painful, but predictable.

On-premises infrastructure remains a strong fit for organizations with sensitive datasets and predictable workloads. For example, a healthcare team keeping regulated imaging data close to local GPU resources avoids both compliance concerns and the cost of repeatedly moving large datasets to the cloud. The same organization may still use cloud GPUs for less sensitive experimentation while keeping primary datasets on-premises.

Edge deployments address situations where round-trip latency to a central location is unacceptable. Storage and compute stay together. Full stop.

Hybrid architectures are where many organizations I’ve talked to usually land. Cold data resides in object storage, active datasets are staged close to GPU clusters, and cloud resources absorb temporary demand spikes. Managing data movement between tiers without introducing new bottlenecks is the hard part.

HCI and software-defined storage at the edge

HCI and software-defined storage are not the primary answer for hyperscale AI training, but they fit several adjacent use cases well. Edge inference and local data preparation are sweet spots, as are compact clusters where operational simplicity matters. Hyperscale training is not.

Consider a factory running local inference on production-line camera feeds. Sending every request to a centralized data center introduces unnecessary latency and creates a dependency on WAN connectivity. A compact hyperconverged cluster keeps compute and storage in the same environment, eliminating the need for a separate storage network. If you’re designing edge AI infrastructure, edge storage deserves consideration as its own architectural category.

StarWind HCI Appliance and StarWind VSAN are designed around this model. They support two-node and small-cluster deployments where compute and storage share the same hardware, high availability remains local, and there’s no dependency on a centralized storage network. StarWind VSAN also provides software-defined fault tolerance without requiring a dedicated witness node. We’ve found this particularly valuable at remote sites where every additional server adds cost and operational overhead, and where shipping a replacement part might take days.

Common AI storage mistakes

Most AI storage problems are predictable. In fact, we see the same mistakes repeatedly, regardless of team size, budget, or the sophistication of the models involved.

Buying GPUs before checking storage throughput is probably the most common. The GPUs arrive, the cluster is deployed, and only then does the team discover that the existing storage system can’t feed them at the required rate. The hardware budget is spent. Not on the actual bottleneck.

Testing with generic benchmarks creates false confidence. Sequential read tests pass, while checkpoint writes and small-file handling still fail. AI-specific workload testing should be part of storage validation before any major hardware investment is made.

Treating object storage as a universal hot tier is a recurring mistake in first-generation AI environments. Object storage scales extremely well for data lakes and archives. It also handles large repositories without issue. But active training workloads typically require lower and more predictable latency than S3-compatible storage can provide. Over long training runs and repeated dataset scans, that gap becomes increasingly visible.

No monitoring of GPU wait time means teams notice slow runs but can’t locate the cause. GPU idle cycles tied to data loading are the most actionable signal of a storage bottleneck, and the metric most commonly missing from AI infrastructure dashboards.

What to check before your next GPU purchase

Storage is rarely the first thing teams investigate when AI jobs run slowly, but it’s frequently where the actual limit sits. Many storage bottlenecks can be resolved without buying additional hardware. Start there.

Before you buy your next GPU, run a simple test. Stage your dataset on local NVMe, watch the utilization graph, and compare it to your shared storage baseline. If the gap is wide, you don’t have a compute problem. You have a plumbing problem. Fix the storage first. The GPUs can wait. They’re already good at that.

FAQ

What is an AI storage bottleneck?

An AI storage bottleneck is any limitation in throughput, latency, IOPS, metadata performance, or network bandwidth that prevents workloads from receiving data fast enough to keep compute resources fully utilized.

Why do GPUs sit idle during AI training?

GPUs typically sit idle when the data pipeline cannot deliver training samples quickly enough. Common causes include slow shared storage, saturated network links, inefficient data loaders, or datasets that have not been staged close to compute resources.

What storage is best for AI workloads?

Hot training data benefits from NVMe or a parallel file system. The data lake and cold datasets suit S3-compatible object storage. Checkpoints need a tier that absorbs burst writes. The right design is tiered, matched to each pipeline stage.

Is object storage good for AI?

Yes, but it depends on the workload. Object storage works well for AI data lakes, backups, archives, and long-term dataset repositories. It is generally less effective as the primary hot training tier unless additional caching or staging layers are used.

Is NVMe required for AI storage?

Not always, but it is the fastest option for hot datasets, checkpoint writes, and model loading. Many teams use NVMe as a local staging tier with colder data in NAS or object storage behind it.

What is the difference between AI storage for training and inference?

Training needs high sustained throughput for dataset reads and burst write capacity for checkpoints. Inference needs low latency for model loading, KV-cache access, and embedding retrieval.

How do I know if storage is slowing down my AI workloads?

Start by monitoring GPU utilization during data loading and correlating it with storage latency, throughput, and network utilization. A simple validation test is to move the dataset to local NVMe storage and rerun the workload. If performance improves significantly, storage or the network path is likely the bottleneck.

Can HCI help with AI storage bottlenecks?

For edge AI, local inference, and smaller training clusters, yes. For large-scale distributed training, dedicated high-throughput storage is usually more appropriate.

What storage metrics matter for AI workloads?

GPU utilization, storage throughput and latency, IOPS, queue depth, network saturation, data loader time, and checkpoint write duration.



from StarWind Blog https://ift.tt/f8qBa7d
via IFTTT

UniconOS Management Cloud: A simpler way to manage enterprise endpoints without infrastructure overhead

Enterprise IT leaders often recognize it, even if they don’t call it out: an endpoint program grows—a new region, a new business unit, another wave of devices—and what started as a focus on policy, lifecycle, and user experience subtly shifts. The conversation moves from endpoints to the platform behind them: how it runs, scales, stays available, and who owns the risk.

At this point, endpoint management can feel less like a strategic capability and more like an infrastructure project. The cost is real—not always on a budget line, but in time, operational drag, and missed opportunities. Rollouts slow, growth plans become sizing exercises, and reliability turns into a governance discussion. All at a time when the business expects IT to move faster, not become more operationally burdened.

The strategic question is no longer whether you can manage endpoints. It is whether managing the management platform itself is where you want to focus your best resources.

This is exactly the problem UniconOS Management Cloud, formerly Scout, is designed to solve, bringing a cloud‑hosted operating model to enterprise endpoint OS management without sacrificing control.

The operating model is the real decision

Some organizations deliberately run platforms internally; full control, tailored governance, and internal ownership are part of their DNA. Others want endpoint management to behave like a service: reliable, scalable, and predictable – without turning every growth step into another infrastructure initiative.

In today’s environment, where IT is measured on speed and resilience, operating model decisions are resource decisions. Where should your best people focus: maintaining platform infrastructure or improving endpoint outcomes, security posture, and user experience?

This is not a technical question. It is a strategic one.

Extending Citrix cloud to endpoint OS management

Citrix continues to simplify how customers consume secure digital workspaces. Extending that philosophy to endpoint OS management is a natural next step.

With Citrix UniconOS, customers get an endpoint OS platform built for Citrix environments.

Citrix UniconOS Management, available in Local and Cloud deployment models, provides the management layer for UniconOS, handling policies, configuration, and visibility.

  • The Local deployment model is customer-managed, giving organizations full control over infrastructure, scaling, and availability.
  • The Cloud deployment model is Citrix-managed and reduces operational overhead while scaling.

What’s new is the ability to choose between these two deployment models: organizations can run UniconOS Management in Local mode or adopt the Cloud deployment model, where Citrix operates the underlying platform services in a managed cloud environment. This is a deliberate choice based on how much operational responsibility teams want to carry.

The strategic cost of endpoint management infrastructure

Endpoint programs are growing faster and becoming more complex: more locations, device types, distributed teams, and higher security expectations. Each expansion introduces operational risk, and every delay slows business outcomes.

Running the underlying platform is increasingly specialized. When IT teams spend significant time maintaining infrastructure instead of managing endpoints, they trade strategic focus for operational overhead. It’s a tradeoff most executive teams are trying to reduce.

Infrastructure that does not directly create business value should not become a scaling tax. UniconOS Management Cloud reduces platform operations overhead, allowing IT to focus on what drives real impact: faster deployments, secure and reliable endpoints, and smoother user experiences – without sacrificing control where it matters most.

Four strategic outcomes of Citrix UniconOS Management Cloud

1. Flexibility
Choose the operating model that aligns with governance, regulatory, and organizational requirements. Customer-managed remains fully supported; Citrix-managed is available when simplicity and cloud operations are the priority.

2. Scale without friction
Endpoint growth should not trigger new platform projects. The management layer scales with device counts without repeated infrastructure redesign.

3. Enterprise by design
Availability, monitoring, backup, and resilience are built in. Reliability is foundational, not an afterthought.

4. Accelerated time-to-value
Move from evaluation to rollout without lengthy infrastructure preparation. Removing platform operations as a bottleneck lets teams focus on delivering endpoint outcomes faster.

Getting started with Citrix UniconOS Management Cloud

Exploring a managed operating model does not require a large-scale transformation. A focused proof of concept allows teams to validate outcomes that matter most—time-to-value, scalability, availability, and reduced operational overhead—before broader rollout decisions.

Local mode remains the right choice for many customers. The Cloud mode exists for organizations that want to reduce operational burden while maintaining clarity, control, and enterprise standards.

Endpoint strategy should never be constrained by platform operations. With UniconOS Management Cloud, Citrix customers gain the freedom to decide where control truly matters – and where simplicity drives value. The choice of operating model is ultimately a choice about focus.

Next Steps

Start with a focused proof of concept to validate time-to-value, scalability, and operational overhead. Contact your Citrix representative to explore your options.



from Citrix Blogs https://ift.tt/LTke24I
via IFTTT

The Good, the Bad and the Ugly in Cybersecurity – Week 24

The Good | Authorities Dismantle Crypto Laundering Empire & Seize Espionage Domains

Europol has dismantled a major cryptocurrency laundering network called “AudiA6”, known for actively facilitating illicit transactions for ransomware syndicates and cybercriminals worldwide. Since 2022, the platform allegedly laundered more than $380 million by obscuring the origin of cybercrime proceeds through complex transaction routes for a 3-10% service commission. The joint operation, spanning 11 countries and supported by Eurojust, successfully seized multiple domains and froze a substantial amount of AudiA6’s digital assets.

Following forensic analysis stemming from a prior arrest in Poland, investigators were able to identify and apprehend the platform’s two senior administrators in Georgia. The industrial-scale infrastructure relied on thousands of fraudulent exchange accounts, all registered by recruited money mules using stolen identities. The suspects, who also managed the “Dark2Web” cybercrime forum, now face potential 20-year prison sentences for operating the illicit service.

The FBI has seized 13 fraudulent websites operated by suspected Chinese intelligence agents attempting to recruit U.S. citizens holding sensitive government security clearances. The campaign used AI-generated photographs and stolen identities to construct fake consulting firms that advertised generic analyst and consultant roles across major professional networking platforms including Upwork, Hubstaff Talent, and Wellfound.

When targets applied, operatives then pressured the candidates to disclose confidential or non-public information in exchange for lucrative compensation. To obscure their identities and the origin of funds, the recruiters used cryptocurrency and online payment systems.

Federal authorities have now successfully identified and dismantled the network after several targeted individuals reported the suspicious payment methods to investigators. Officials continually urge current and former government personnel to exercise extreme caution regarding unsolicited recruitment offers promising easy income for vague consulting work.

The Bad | JDY Botnet Expands Scope to Target U.S. Military Networks for Cyber Reconnaissance

A malware network previously associated with PRC-based threat groups like Volt Typhoon is expanding its cyber reconnaissance operations and target scope. Known as “JDY botnet”, the network has grown rapidly from approximately 650 active bots in early 2024 to over 1,500 compromised small office/home office (SOHO) and Internet of Things (IoT) devices today. While operators maintain a global footprint, they are now heavily concentrating efforts within the United States, specifically focusing on the military and its associated networks.

Unlike traditional distributed denial-of-service (DDoS) botnets, JDY functions primarily as a distributed scanning and fingerprinting network. Operators weaponize the network to quickly locate vulnerable infrastructure immediately following public vulnerability disclosures.

The malware then registers with a central dispatch service hosted on hidden Tor networks to receive scanning assignments. Once deployed on compromised edge devices, including hardware from Cisco, Ubiquiti, and Hikvision, the botnet executes comprehensive service discovery, service banner grabbing, TLS certificate collection, and protocol fingerprinting. When it has enough administrative privileges, JDY performs exceptionally fast and stealthy SYN scanning using custom-crafted TCP packets to batch-process thousands of potential targets.

A snippet of the JDY malware dropper that downloads and executes the malware (Source: Black Lotus Labs)

Federal agencies previously warned about the risks to unprotected routing infrastructure. To prevent hardware from being recruited into these vast reconnaissance networks, administrators must consistently ensure all edge devices run the latest security patches. Organizations can proactively reduce their external attack surfaces by disabling unnecessary internet-exposed management interfaces, fully replacing default administrative credentials, and thoroughly monitoring for any unusual outbound scanning activity originating from local networks.
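As a rough illustration of the last recommendation, the sketch below flags local devices that contact an unusually large number of distinct destinations, which is one common sign of scanning. The `Flow` record shape and the `max_targets` threshold are assumptions made for this example, not details from the report; real flow logs and sensible thresholds will differ by environment.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """One outbound connection attempt (hypothetical record shape)."""
    src_ip: str
    dst_ip: str
    dst_port: int


def find_scanning_sources(flows: Iterable[Flow], max_targets: int = 100) -> dict[str, int]:
    """Return {source: distinct target count} for sources above max_targets."""
    targets: dict[str, set[tuple[str, int]]] = defaultdict(set)
    for flow in flows:
        # Count each (destination, port) pair once per source.
        targets[flow.src_ip].add((flow.dst_ip, flow.dst_port))
    return {src: len(seen) for src, seen in targets.items() if len(seen) > max_targets}
```

A device such as a camera or router that suddenly appears in this result is worth a closer look, although busy servers and resolvers can legitimately exceed a naive threshold.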

The Ugly | Miasma Supply Chain Worm Continues Propagation Across Microsoft & PyPI Repositories

The ongoing Miasma self-replicating supply chain worm recently compromised 73 Microsoft GitHub repositories, including projects related to Azure, prompting GitHub to rapidly disable access. In an evolution of the “Mini Shai-Hulud” malware, threat actors are now directly pushing malicious configuration files into legitimate source repositories.

The hidden payloads automatically trigger code execution whenever developers open the compromised projects using popular AI coding assistants or integrated development environments (IDEs). The latest intrusions most notably involve the re-compromise of the “durabletask” PyPI package, indicating attackers retained previously stolen developer credentials to seamlessly propagate the worm through automated contributor workflows.

Miasma continues to infect more packages on GitHub (Source: TheHackerNews)

Since the series of Microsoft repo breaches, the campaign has evolved into a fresh attack wave dubbed “Hades”, actively targeting the PyPI registry. Attackers poisoned 19 PyPI packages with malicious wheel artifacts containing hidden .pth setup files. This mechanism executes silently during Python interpreter startup, entirely eliminating the need for victims to explicitly import the compromised packages.
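The startup behavior comes from Python's standard `site` module, which executes any line in a `.pth` file that begins with `import` followed by a space or tab. The following is a minimal audit sketch, not a detector for this specific campaign: it lists such lines so a person can review them. The function name `executable_pth_lines` is made up for this example.

```python
import sysconfig
from pathlib import Path


def executable_pth_lines(directory: Path) -> list[tuple[Path, str]]:
    """List lines in *.pth files that the site module would execute at startup."""
    findings = []
    for pth_file in sorted(directory.glob("*.pth")):
        text = pth_file.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            # Lines starting with "import" plus a space or tab are executed;
            # other non-comment lines are treated as paths to add to sys.path.
            if line.startswith(("import ", "import\t")):
                findings.append((pth_file, line.strip()))
    return findings


if __name__ == "__main__":
    paths = sysconfig.get_paths()
    # purelib and platlib are the site-packages directories of this interpreter.
    for directory in {Path(paths["purelib"]), Path(paths["platlib"])}:
        for pth_file, line in executable_pth_lines(directory):
            print(f"{pth_file}: {line[:120]}")
```

A hit is not proof of compromise. Legitimate tools, including setuptools and some editable installs, ship `.pth` files with import lines, so the output is a review list rather than a verdict.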

The payload then downloads the standalone Bun JavaScript runtime to evade traditional network proxies, subsequently deploying a heavily obfuscated credential stealer. The malware aggressively harvests cloud access tokens, SSH keys, shell histories, and Docker configurations while introducing new, tailored memory scrapers specifically targeting macOS and Windows environments.

Advanced in its defensive evasion, the Hades variant incorporates novel plain-text prompt injections deliberately designed to deceive LLM-based package analysis tools into incorrectly classifying the malicious packages as safe.

Ultimately, these cascading supply chain attacks successfully exploit fundamental trust models within open-source ecosystems, leveraging compromised, authenticated maintainer accounts to embed persistence mechanisms directly into standard developer environments.



from SentinelOne https://ift.tt/0awiN9H
via IFTTT

Agentjacking Attack Tricks AI Coding Agents Into Running Malicious Code

Cybersecurity researchers have described what they say is a new class of attack that can trick artificial intelligence (AI) coding agents into running arbitrary code on developer machines.

Called Agentjacking by Tenet Security, the attack can be triggered by means of a fake error report crafted using Sentry, an open-source error-tracking and performance-monitoring platform.

"The attack exploits a critical architectural flaw at the intersection of Sentry's event ingestion (which accepts arbitrary payloads from anyone with the DSN) and the Sentry MCP server (which returns this data to AI agents as trusted system output)," security researchers Ron Bobrov, Barak Sternberg, and Nevo Poran said.

The idea is to inject crafted input into Sentry error events, which are then interpreted by coding agents like Claude Code and Cursor as legitimate diagnostic resolution steps and run attacker-controlled code.

A successful attack of this kind can expose sensitive data, including environment variables, Git credentials, private repository URLs, and developer identities, without having to rely on methods like phishing or prior server compromise.

The problem is rooted in the implicit trust associated with connecting to external services using Model Context Protocol (MCP). Because an AI agent is unable to distinguish between an error event generated by a real application crash and one injected by an attacker, it creates a pathway to arbitrary code execution when the agent processes the response.

The attack chain devised by Tenet is as follows -

  • An attacker finds a target's Sentry Data Source Name (DSN), a public, write-only credential that's embedded in websites.
  • The attacker sends a malicious error event to Sentry's ingest endpoint via a POST request using the DSN.
  • The injected event contains "carefully formatted markdown" in the message field and context key names. When the Sentry MCP server returns this event to an AI agent, it is rendered as structured content visually identical to Sentry's system template.
  • When a developer asks their AI coding agent to "fix unresolved Sentry issues" (or a similar prompt), the agent queries Sentry via MCP and receives the malicious event.
  • The agent executes malicious code, which runs with the developer's full privileges.
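Because the chain above depends on attacker-supplied text being read as instructions, one partial precaution is to screen event text before an agent sees it. The sketch below is a heuristic tripwire only, and the researchers' own point is that the agent cannot reliably tell injected guidance from real guidance. The field names (`message`, `contexts`) and the pattern list are assumptions for this example, not Sentry's actual schema or any vendor's filter.

```python
import re

# Heuristic patterns only; an attacker can rephrase around any fixed list.
SUSPICIOUS_PATTERNS = {
    "markdown heading": re.compile(r"^\s{0,3}#{1,6}\s", re.MULTILINE),
    "code fence": re.compile(r"`{3}"),
    "download piped to interpreter": re.compile(
        r"\b(curl|wget)\b[^\n]*\|\s*(sh|bash|python3?)\b"
    ),
    "instruction to run something": re.compile(
        r"\b(run|execute)\b[^\n]{0,40}\b(command|script)\b", re.IGNORECASE
    ),
}


def iter_strings(value, path="event"):
    """Yield (path, text) for every string key and value in a JSON-like object."""
    if isinstance(value, str):
        yield path, value
    elif isinstance(value, dict):
        for key, item in value.items():
            # Key names are checked too, since the attack abuses context key names.
            yield f"{path}.{key} (key)", str(key)
            yield from iter_strings(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from iter_strings(item, f"{path}[{index}]")


def flag_untrusted_text(event: dict) -> list[tuple[str, str]]:
    """Return (location, reason) pairs for text that looks like instructions."""
    findings = []
    for path, text in iter_strings(event):
        for reason, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(text):
                findings.append((path, reason))
    return findings
```

A flagged event could be withheld from the agent or shown to the developer for confirmation. Requiring human approval before an agent runs any command derived from external data is the stronger control, and this check does not replace it.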

"The attacker never touches the victim's infrastructure," the researchers explained. "The malicious instruction arrives disguised as a legitimate 'Resolution' inside an ordinary error. When a developer asks their AI agent to fix the Sentry issue, the agent reads the attacker's command as trusted guidance and runs it - with the developer's own privileges, on the developer's own machine."

Agentjacking stands out because it targets the AI agent a developer trusts and uses a Sentry DSN as a starting point. In addition, the markdown injection is rendered such that the agent cannot distinguish it from legitimate Sentry guidance.

The AI cybersecurity company said it found at least 2,388 organizations exposed with valid injectable DSNs, and that it tested the attack in a controlled manner against over 100 organizations, achieving an 85% exploitation success rate against injected errors across some of the most widely used AI coding assistants.

Sentry, for its part, has acknowledged the issue, but opted not to fix it, stating it's "technically not defensible." However, the company is said to have activated a global content filter that blocks a "specific payload string."

"As enterprises race to deploy AI coding agents, this research proves the agents themselves are now the attack surface - turned against the developers who trust them, using nothing but data those organizations publish about themselves," Tenet said. "The attack bypasses EDR, WAF, IAM, VPN, Cloudflare, and firewalls - because there is nothing malicious to detect. Every action in the chain is authorized."



from The Hacker News https://ift.tt/VE0b4Wf
via IFTTT

Rethinking MDR as Attackers and Defenders Embrace AI

For most of the past decade, managed detection and response was the answer to a real problem. Security teams couldn't staff around the clock, couldn't hire enough analysts, and needed someone else to handle the alert queue. MDR stepped in. It worked well enough. Until now.

The threat landscape has changed faster than the MDR model can adapt. Attackers are using AI to move faster, generate more convincing phishing at scale, automate reconnaissance, and create malware variants that evade signature-based detection. The attack surface has expanded from endpoint to cloud, identity, and network simultaneously. And yet MDR is still doing what it always did. Routing alerts to human analysts who triage what they can, in the order they can get to it.

That is no longer enough. The data we share below shows why, and security leaders might consider whether they have outgrown their MDR.

MDR's 24/7 promise doesn't cover 60% of your alerts

MDR promised 24/7 human coverage. What it delivered was a 24/7 human capacity to triage high-severity alerts. Those are not the same thing.

Across the industry, approximately 60% of alerts go unreviewed. That's not a performance failure. Human teams, whether in-house or outsourced to an MDR, cannot process the volume of alerts that modern environments generate. So they do what any rational person does. They prioritize. P1s and P2s get worked. P3s and P4s pile up.

But this is exactly where attackers hide.

Analysis of 25 million alerts across global enterprises in 2025 found that nearly 1% of real threats originate in low-severity and informational alerts. In an enterprise generating 450,000 alerts annually, that translates to roughly 54 real incidents per year, about one per week, sitting in the deprioritized queue where no one is looking.

The breaches hiding in that backlog are not theoretical. They are happening right now, in organizations that believe they have coverage.

Note: The math behind the above statement assumes 450K annual alerts, of which 60% are not investigated and of those, 2% are real incidents. Of those real incidents, 1% originate in low-severity alerts.
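For readers who want to check the figure, the note's arithmetic can be reproduced in a few lines. This sketch uses only the percentages stated in the note; the function name is made up for illustration, and integer arithmetic is used to avoid rounding noise.

```python
def missed_incidents(annual_alerts: int, unreviewed_pct: int,
                     real_pct: int, low_severity_pct: int) -> tuple[int, int, int]:
    """Apply the note's three percentages in sequence."""
    unreviewed = annual_alerts * unreviewed_pct // 100      # alerts nobody investigates
    real_incidents = unreviewed * real_pct // 100           # of those, real incidents
    low_severity = real_incidents * low_severity_pct // 100  # of those, low-severity origin
    return unreviewed, real_incidents, low_severity


unreviewed, real_incidents, low_severity = missed_incidents(450_000, 60, 2, 1)
print(unreviewed)      # 270000
print(real_incidents)  # 5400
print(low_severity)    # 54
print(f"{low_severity / 52:.2f} per week")  # 1.04 per week
```

The result is sensitive to every input: doubling any one of the three percentages doubles the final count, so the 54 is best read as an estimate under the stated assumptions.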

Investigation quality varies by who is on shift

Even for alerts that do get reviewed, MDR investigation quality is not consistent. It is bounded by the experience of the analyst on duty, the queue depth at that moment, the time of day, and whether the team is fully staffed. A P1 at 3 am gets a different investigation than the same alert at 10 am.

This is not a criticism of MDR analysts. It is a description of what happens when any human-executed process runs at high volume, under pressure, around the clock. Variance is unavoidable.

The consequences are real. When an investigation is shallow, threats get classified as noise. When follow-through is inconsistent, early-stage lateral movement looks like routine behavior. The attacker who got in on a low-severity alert keeps moving undetected because no one had the time or context to connect the signals.

Detection engineering is not a closed loop

In most MDR deployments, detection engineering is a periodic exercise. Rules get tuned when customers complain about alert volume. New coverage gets added when a major CVE makes news. Otherwise, the detection posture drifts.

The core problem is architectural. MDR investigation and detection engineering operate in separate silos. When an analyst investigates an alert and closes it as a false positive, that insight rarely feeds back into the detection system. Broken rules stay broken. Noisy rules keep generating noise. New attacker techniques arrive without matching detections.

The result is a detection posture that degrades faster than it improves. Real coverage, measured against the MITRE ATT&CK framework, can be far lower than teams assume.

You can't audit what you can't see

Most MDR services are a black box. Customers receive escalations and summaries. They do not get to see the investigation logic, inspect the evidence trail, verify the verdict, or audit what the analyst actually reviewed before closing a case.

In an era where accountability and transparency are security requirements, this is a genuine liability. When an incident is missed, you cannot diagnose why. When a verdict is wrong, you cannot trace the reasoning. When regulators ask what was investigated and how, there is no answer.

The AI savings are going to the vendor, not to you

AI is reducing the operational cost of MDR. Providers are using it to automate portions of triage, reduce analyst hours, and increase margins. Those efficiency gains do not flow through to customers as lower prices or expanded coverage. The buyer still pays the same rate, or more. The provider keeps the savings.

But the coverage gap stays the same. The human scaling constraint stays the same. Only the provider's cost structure has improved.

You don't own what was built in your name

Detection rules, triage logic, case history, and investigation learnings accumulate inside the MDR vendor's platform over the life of the contract. When the contract ends, that knowledge does not move with you. The years of tuning, the accumulated context about your environment, and the detection improvements built from your data all stay with the vendor.

This creates two problems. First, organizations that switch providers start from scratch, rebuilding institutional knowledge that took years to develop. Second, organizations that want to bring security operations in-house, a trend that is accelerating as AI SOC tools mature, find themselves starting with no foundation.

MDR providers, for obvious reasons, are not incentivized to help customers build internal capability. Their model depends on retaining the work.

Your MDR contract may block you from using Claude for your SOC

The above-mentioned knowledge lock-in is no longer just a switching-cost problem. It's also an AI readiness problem. When you try to deploy an AI agent for SOC work, it needs a knowledge foundation to reason over. Detection rules, case history, behavioral baselines, and forensic verdicts. If those live in your MDR vendor's platform, your agent is starting from near zero.

Additional MDR gaps worth noting

Aside from the above, MDR has a set of smaller gaps that compound over time. Every customer gets the same generic playbook regardless of their specific risk profile, compliance obligations, or data sensitivity. Integration tools like SOAR, which were supposed to streamline MDR findings into internal workflows, largely failed to deliver on that promise because human-driven investigation doesn't produce the structured, consistent outputs that automation requires. And when a real incident surfaces and a customer needs to talk to someone who understands their environment, they often reach an AI chatbot or a ticketing queue instead of a person.

What the AI-powered attacker era actually requires

The attackers of 2026 are not waiting for alert queues to clear. AI-generated phishing campaigns hit inboxes at a volume and quality that bypass conventional gateways. Credential stealers like Agent Tesla and LummaC2 move fast. EDR tools are being actively evaded, with research showing that more than half of confirmed compromised endpoints had already been marked as "mitigated" by the EDR vendor. The attacker has already won a round that the defender didn't know was being played.

Meeting this moment requires a different operating model. One where investigation speed is measured in seconds, not hours. Where every alert gets examined, regardless of severity or time of day. Where the output is an evidence-backed verdict, not an analyst's judgment call under pressure.

This is what an AI SOC is designed to deliver.

An operating model shift where AI executes and humans supervise

The core idea behind an AI SOC is simple. Move investigative execution out of the human queue and into AI, so that humans can focus on decisions rather than discovery.

In practice, this means 100% of alerts, including endpoint, identity, cloud, network, phishing, and SIEM, are triaged and investigated automatically. Not sampled. Not filtered by severity. All of them. The AI applies the same forensic depth to a P4 alert at 3 am that a senior analyst would apply to a P1 in the afternoon.

Intezer's platform data across 25 million alerts shows this is achievable. Fewer than 2% of alerts required human escalation. The more than 98% that resolved autonomously did so with sub-minute median triage time and 98% verdict accuracy. For a large enterprise with 450K annual alerts, that means roughly 441K alerts per year are fully investigated and resolved without human intervention, and 54 genuine threats that would have been missed under traditional MDR coverage are now caught with actionable remediation recommendations.

Forensic depth is what makes AI autonomy trustworthy

AI can summarize an alert. That's useful. AI can enrich with threat intelligence. Also useful. But neither of those activities is investigation. They are pre-processing.

Genuine AI-driven investigation requires forensic-level interrogation. When an alert fires, the question is not "does this look suspicious?" It is: what actually executed, where did it originate, what did it do, and is there evidence of compromise in memory that the alert itself didn't surface?

This matters because the most dangerous threats are specifically designed to evade surface-level detection. Fileless malware lives entirely in memory and writes nothing to disk. Code injection hides inside legitimate processes. Early-stage credential theft looks like normal authentication. Without memory forensics, binary analysis, and code reuse detection, an AI investigation is only as deep as the alert data it was handed.

Forensic depth is also what creates the trust threshold, the point at which AI verdicts are accurate and evidence-backed enough to act on without human validation. Below that threshold, AI assists analysts. Above it, AI can safely take on the full investigative workload and escalate only when evidence warrants it.

Closed-loop detection engineering changes everything

One of the most significant structural advantages of a true AI SOC is the closed loop between investigation and detection. Every alert investigation surfaces information about detection quality: which rules are firing accurately, which are generating noise, and which attacker techniques have no coverage at all.

When this feedback flows continuously into detection engineering, the posture improves without waiting for an annual audit or a customer complaint. Noisy rules get tuned. Broken telemetry gets flagged. New coverage for emerging techniques gets deployed in days, not months. The detection system gets smarter alongside the investigation system.
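One way to picture this feedback loop is a job that reads closed investigations and lists the rules whose alerts are almost always false positives. The sketch below is an illustration under assumed inputs, not Intezer's implementation: the `(rule_id, verdict)` pairs, the `"false_positive"` label, and both thresholds are hypothetical.

```python
from collections import Counter
from typing import Iterable


def noisy_rules(verdicts: Iterable[tuple[str, str]],
                min_alerts: int = 20,
                max_fp_rate: float = 0.9) -> dict[str, float]:
    """Return {rule_id: false positive rate} for rules that look like tuning candidates."""
    totals: Counter = Counter()
    false_positives: Counter = Counter()
    for rule_id, verdict in verdicts:
        totals[rule_id] += 1
        if verdict == "false_positive":
            false_positives[rule_id] += 1
    return {
        rule: false_positives[rule] / total
        for rule, total in totals.items()
        # Skip rules with too few alerts to judge, then keep only the noisy ones.
        if total >= min_alerts and false_positives[rule] / total >= max_fp_rate
    }
```

The same tally, run the other way, would show ATT&CK techniques with no firing rules at all, which is the coverage gap side of the loop.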

This is how MITRE ATT&CK coverage moves from a static baseline to a dynamic, improving map of what an organization can actually detect. It is the difference between coverage that reflects what was set up two years ago and coverage that reflects what attackers are doing today.

Pricing that aligns with full coverage

The economics of an AI SOC should match the coverage it provides. Per-alert pricing, still common among AI copilot tools that rely heavily on LLMs, forces customers to be selective about which alerts to send. The result is the same cherry-picking problem that MDR created: high-severity alerts get the attention while low-severity alerts accumulate in a deprioritized queue.

Per-endpoint pricing changes this entirely. The cost is fixed to the number of monitored endpoints, not to alert volume. There is no economic penalty for investigating every alert. Full coverage becomes the default, not a premium option.

This also matters for budget predictability. Alert volumes spike unpredictably during active incidents or when new detections deploy. Endpoint counts are stable. For finance teams trying to plan security spend, the difference is significant.

What ownership looks like under an AI SOC

Detection rules, investigation history, and organizational context should belong to the organization, not to the vendor. This means every detection deployed to a customer's SIEM is the customer's rule. Investigation evidence is available for audit at any time. If the organization decides to expand internal capability, build its own AI agents, or switch tools, it takes everything with it.

This is not just a contract term. It is a prerequisite for security maturity and for broader adoption of AI tools like Claude for your security team. Organizations that want to eventually supervise AI systems rather than outsource to vendors need a knowledge foundation to build on. That foundation cannot exist if it lives inside a vendor's platform.

The transition from MDR to AI SOC

Moving from MDR to an AI SOC is not necessarily a rip-and-replace decision for most organizations. The practical path might be augmentation first. Bring in AI investigation alongside the existing MDR contract, observe what the AI surfaces that the MDR was missing, and let the comparison build the case for a clean transition at renewal.

By the time the MDR contract is up for renewal, the organization typically has months of evidence showing what full alert coverage looks like, what the escalation rate was under AI triage, and what it would cost to maintain the old model versus the new one. The decision is no longer theoretical.

The question security leaders need to answer

The MDR model was designed for a world where attackers operated at human speed, and the primary challenge was staffing coverage. That world is gone. Attackers are running AI-assisted campaigns, moving through environments faster than human triage queues can respond, and specifically targeting the low-severity signal space where MDR leaves blind spots.

The question for every CISO and security leader evaluating their current operations is straightforward. Of the 60% of alerts your team isn't reviewing, how confident are you that none of them contain a real threat?

The answer, informed by Intezer's analysis of 25 million real alerts, is that roughly 54 of them do. Every year. One per week. In the pile that no one is looking at.

The AI SOC doesn't promise to eliminate all threats. No platform does. But it closes the coverage gap that the MDR model structurally cannot. Every alert, every severity, every hour of the day, is investigated with forensic depth, in under a minute. That is what security operations in the AI era look like.

Found this article interesting? See the 2026 MDR renewal checklist by Intezer.

Found this article interesting? This article is a contributed piece from one of our valued partners. Follow us on Google News, Twitter and LinkedIn to read more exclusive content we post.



from The Hacker News https://ift.tt/j04YsB3
via IFTTT

Thursday, June 11, 2026

A tale of two eras

Welcome to this week’s edition of the Threat Source newsletter. 

To the surprise of absolutely no one who has seen my face, I’m one of the younger employees at Talos. As my industry veteran colleagues were buying the first iPods, navigating the switch from dial-up to broadband, saying goodbye to floppy disks, and making Myspace accounts, I was playing with my Password Journal and Friend Chips. It’s a funny contrast, but I still experienced the beginning of the “always-on” era. 

Ah, those were the days. One of my most vivid tech memories is begging my dad to play games on his Handspring Visor — a classic personal digital assistant (PDA) launched in late 1999 by Handspring, a company formed by the original creators of the PalmPilot. Handspring stopped producing the Visor line in 2002 and it eventually became obsolete, mostly because its desktop sync feature couldn't keep up with modern OS updates. Despite the tech debt, I spent hours playing Asteroid, Centipede, and Hardball (aka Breakout) on that thing. My dad, meanwhile, mostly used the Memo function to store his passwords... which he still does today. (Yeah, I’m still working on getting him to see the wonders of 1Password.) 

You might be wondering what made me reminisce on childhood toys. A few weeks back, my fiancée and I drove a few hours to visit my family. Even if we get in at 9:00 p.m., it’s tradition for us to stay up late eating pizza and talking about random stuff. 

We got on the topic of phones because my parents still have a landline, and I mentioned that walkie talkies were my first introduction to having my own personal device. My dad dug some old ones out, set them on the table, and put them on scan while we chatted.  

At some point, the conversation petered out just when the walkie talkie captured a channel. Radio static, and then a kid’s voice broke our silence: “Your butt crack is out.” 

My dad got an impish grin and brought the talkie up to his mouth. My mom pleaded, “No. Honey, no. Don’t.” The rest of us were already wheezing and crying. 

He pressed the talk button and, in his best crotchety old man voice, bellowed, “Hey, you kids. Get off my lawn!” 

Imagine being those poor kids. It’s a funny story, but if you don’t want people like my dad intercepting your comms, maybe stick to encrypted channels. 

The one big thing 

Talos' Yuri Kramarz published a blog highlighting how AI-driven vulnerability discovery has completely outpaced human patching capabilities. With frontier AI models autonomously discovering and exploiting zero-days in minutes, the traditional vulnerability lifecycle has completely collapsed. To survive this hyper-accelerated threat environment, organizations must abandon patch-reliant strategies and embrace a three-stage fallback model built on foundational security principles. 

Why do I care? 

Speed is the new, terrifying multiplier in the traditional risk equation. When an AI can uncover a decades-old zero-day and write an exploit for it in minutes, relying solely on vulnerability management is a losing game. Defenders must accept that some exploitation will inevitably slip through the cracks. The true measure of security is no longer just prevention, but how well your environment can absorb, detect, and survive the initial blow. 

So now what? 

Stop treating security basics like optional compliance checkboxes. Enforce multi-factor authentication (MFA) everywhere, harden devices using CIS benchmarks, and implement strict network segmentation to limit an attacker's blast radius. Since hardened systems only slow attackers down, deploy behavioral-based EDR, NDR, and XDR to catch the post-exploitation activity that signatures miss. Finally, validate these controls through penetration testing and purple team exercises so your incident response playbooks become muscle memory, not just wishful thinking. Read the full blog for more. 

Top security headlines of the week 

CISA gives U.S. federal agencies three days to fix a VPN bug under attack by Qilin 
Check Point Software said the bug affects several of its remote access tools, firewalls, and VPNs, which act as digital gatekeepers to protect company networks from unauthorized access. (TechCrunch)

Anthropic launches Claude Fable 5: Mythos-class AI with cybersecurity guardrails  
The AI giant says this marks the first time a model of this capability class has been deemed safe enough for widespread public and developer access. (SecurityWeek)

Microsoft fixes two high-severity zero-days disclosed by researcher 
The vulnerability is a local privilege escalation, meaning it can be chained to a separate vulnerability to give users or processes with low-level privileges the ability to defeat OS protections and gain full SYSTEM rights needed to install malware. (Ars Technica)

WhatsApp catches spyware firm NSO defying no-hacking court order 
According to WhatsApp, the spyware maker has violated the permanent injunction. The messaging app reported on Monday that it had recently learned of a social engineering attack that attempted to trick users into clicking on malicious links. (SecurityWeek)

High-severity vulnerability in Linux caused by a single faulty character 
The presence of a single mis-issued exclamation point in code implementing nf_tables introduced a use-after-free, a class of vulnerability that corrupts memory by placing malicious code at memory addresses that haven’t been properly freed of their previous contents. (Ars Technica)

Can’t get enough Talos? 

Hypotheses, telemetry, and human judgment: Inside Cisco Talos Threat Hunting 
Learn how Cisco Talos Threat Hunting uses hypothesis-driven methods and multi-domain telemetry correlation to find stealthy threats operating below automated detection thresholds. 

Winning the cyber marathon with Tony Giandomenico 
In the high-speed world of cybersecurity, the difference between a breach and a breakthrough often comes down to endurance. Tony Giandomenico, Senior Director of Product Management with Cisco Talos, joins me to discuss Talos Threat Hunting, the challenges of leading major product launches, and the grueling discipline of Ironman triathlons. 

When synthetic logs don’t lie: Generating coherent attack stories for better detection 
Are your detection rules failing because your test data lacks the nuance of a real-world network?  In this episode of Talos Takes, Amy sits down with David Bianco to discuss why traditional synthetic data often falls short and how his new open-source project, EvidenceForge, is changing the game. 

Upcoming events where you can find Talos 

Most prevalent malware files from Talos telemetry over the past week 

SHA256: 9f1f11a708d393e0a4109ae189bc64f1f3e312653dcf317a2bd406f18ffcc507  
MD5: 2915b3f8b703eb744fc54c81f4a9c67f  
Talos Rep: https://talosintelligence.com/talos_file_reputation?s=9f1f11a708d393e0a4109ae189bc64f1f3e312653dcf317a2bd406f18ffcc507  
Example Filename: VID001.exe  
Detection Name: Win.Worm.Coinminer::1201

SHA256: 96fa6a7714670823c83099ea01d24d6d3ae8fef027f01a4ddac14f123b1c9974  
MD5: aac3165ece2959f39ff98334618d10d9  
Talos Rep: https://talosintelligence.com/talos_file_reputation?s=96fa6a7714670823c83099ea01d24d6d3ae8fef027f01a4ddac14f123b1c9974 
Example Filename: d4aa3e7010220ad1b458fac17039c274_63_Exe.exe  
Detection Name: W32.Injector:Gen.21ie.1201 

SHA256: a31f222fc283227f5e7988d1ad9c0aecd66d58bb7b4d8518ae23e110308dbf91 
MD5: 7bdbd180c081fa63ca94f9c22c457376 
Talos Rep: https://talosintelligence.com/talos_file_reputation?s=a31f222fc283227f5e7988d1ad9c0aecd66d58bb7b4d8518ae23e110308dbf91 
Example Filename: d4aa3e7010220ad1b458fac17039c274_62_Exe.exe 
Detection Name: Win.Dropper.Miner::95.sbx.tg

SHA256: 9896a6fcb9bb5ac1ec5297b4a65be3f647589adf7c37b45f3f7466decd6a4a7f 
MD5: 38de5b216c33833af710e88f7f64fc98 
Talos Rep: https://talosintelligence.com/talos_file_reputation?s=9896a6fcb9bb5ac1ec5297b4a65be3f647589adf7c37b45f3f7466decd6a4a7f 
Example Filename: sample.exe  
Detection Name: Win.Tool.Procpatcher::1201 
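
The SHA256 values listed above can be compared against files on disk. The following is a minimal sketch, not a Talos tool: the function names are illustrative, and only the hashes and detection names are taken from the list.

```python
import hashlib

# SHA256 values and detection names copied from the list above.
KNOWN_BAD_SHA256 = {
    "9f1f11a708d393e0a4109ae189bc64f1f3e312653dcf317a2bd406f18ffcc507": "Win.Worm.Coinminer::1201",
    "96fa6a7714670823c83099ea01d24d6d3ae8fef027f01a4ddac14f123b1c9974": "W32.Injector:Gen.21ie.1201",
    "a31f222fc283227f5e7988d1ad9c0aecd66d58bb7b4d8518ae23e110308dbf91": "Win.Dropper.Miner::95.sbx.tg",
    "9896a6fcb9bb5ac1ec5297b4a65be3f647589adf7c37b45f3f7466decd6a4a7f": "Win.Tool.Procpatcher::1201",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_known_bad(path):
    """Return the detection name if the file's SHA256 is in the list, else None."""
    return KNOWN_BAD_SHA256.get(sha256_of(path))
```

A hash match identifies only byte-identical copies of these samples, so a miss says nothing about repacked or modified variants.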



from Cisco Talos Blog https://ift.tt/nLwlQ32
via IFTTT

Terraform MCP server is now generally available

Terraform MCP server enables AI assistants such as GitHub Copilot, IBM Bob, and Claude Code to interact with Terraform through the Model Context Protocol (MCP). By connecting AI to your infrastructure workflows, teams reduce manual effort, eliminate context switching between tools, and accelerate delivery without compromising security.

Today, we're announcing the general availability of Terraform MCP server, now available for both HCP Terraform and Terraform Enterprise. This represents a milestone shaped by customer and community feedback since we first announced Terraform MCP server last year. In this post, we'll explore how Terraform MCP server improves infrastructure team productivity through AI-assisted workflows, maintains security by design, and provides flexible deployment options for teams of any size.

Accelerate infrastructure workflows with AI

Teams previously spent significant time on repetitive tasks: searching documentation, interpreting plan files, and auditing configurations. Terraform MCP server shifts this burden to AI assistants, allowing engineers to focus on strategic work rather than routine operations.

Generate code using your organization's standards

Before, engineers manually searched private registries for approved modules, copied examples, and verified compliance with organizational policies. This process was time-consuming and error-prone, often resulting in inconsistent infrastructure patterns across teams.

Now, AI assistants can connect directly to your HCP Terraform or Terraform Enterprise private registry. They discover approved modules, understand your organization's patterns, and generate compliant code automatically. This eliminates the need to search for modules manually and ensures consistent infrastructure across your organization, reducing both development time and compliance risk.

Access Terraform workspace data and configurations

Managing infrastructure across multiple workspaces requires constant context switching between tools and interfaces. Traditionally, engineers navigate through web UIs or CLI commands to gather information about workspace configurations and variables, a fragmented workflow that slows down troubleshooting and decision-making.

Terraform MCP server provides AI assistants with direct access to workspace data and configurations. Users can ask questions like "Which workspaces haven't been updated in 90 days?" or "Show me workspaces managing more than 1,000 resources," and receive immediate answers. This unified access eliminates context switching, enabling teams to gain faster insights and make informed decisions without leaving their development environment.
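
The two example questions above reduce to simple filters over workspace metadata. The following is a minimal sketch of that logic only, assuming the workspace records have already been retrieved as plain dictionaries. The field names `updated_at` and `resource_count`, the workspace names, and the sample data are hypothetical and do not represent the MCP server's interface.

```python
from datetime import datetime, timedelta, timezone


def stale_workspaces(workspaces, now, days=90):
    """Names of workspaces whose last update is older than `days` days."""
    cutoff = now - timedelta(days=days)
    return [w["name"] for w in workspaces if w["updated_at"] < cutoff]


def large_workspaces(workspaces, min_resources=1000):
    """Names of workspaces managing more than `min_resources` resources."""
    return [w["name"] for w in workspaces if w["resource_count"] > min_resources]


# Hypothetical sample data for illustration.
NOW = datetime(2026, 6, 13, tzinfo=timezone.utc)
SAMPLE = [
    {"name": "network-prod", "updated_at": NOW - timedelta(days=3), "resource_count": 1450},
    {"name": "legacy-batch", "updated_at": NOW - timedelta(days=200), "resource_count": 80},
    {"name": "data-staging", "updated_at": NOW - timedelta(days=91), "resource_count": 1000},
]

print(stale_workspaces(SAMPLE, NOW))
print(large_workspaces(SAMPLE))
```

In practice the assistant would obtain this data through the server rather than from a hard-coded list; the sketch only shows what the natural-language questions ask for.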

Understand plan changes with context

Terraform plan output can be difficult to interpret, especially for complex infrastructure changes. Engineers have traditionally spent time manually parsing plan files, tracing resource dependencies, and assessing the impact of modifications before approval.

Terraform MCP server now enables AI assistants to analyze plan details and explain changes in natural language. This reduces the risk of misinterpreting plans and speeds up code review cycles, helping teams move faster while maintaining confidence in their infrastructure changes.

Security by design

For infrastructure teams, security is non-negotiable. Terraform MCP server acts as a controlled interface that enforces your existing Terraform authentication and authorization. AI assistants receive only the specific information needed to answer questions, not credentials or sensitive data, which reduces the risk of exposure while maintaining the security boundaries you've already established. The server includes CORS policies, rate limiting, and OpenTelemetry integration for monitoring and security auditing.

Flexible deployment options

Terraform MCP server supports deployment modes that fit how your team works. For individual developers, local execution provides the fastest setup and keeps all data on your machine, ideal for personal development and testing. For teams requiring centralized management, the server can be deployed as a shared service that team members access remotely while maintaining individual access controls through their own Terraform tokens.

Both deployment modes enforce the same authentication model: credentials remain in the deployment environment, while AI assistants receive only the metadata and configuration data needed to respond to queries.

Get started with Terraform MCP server

Terraform MCP server works with multiple AI assistants, including IBM Bob, Claude Desktop, GitHub Copilot, and other MCP-compatible tools. To get started:

  • Read the documentation on setting up the MCP server.
  • View the private registry tutorial.
  • Go to the GitHub repo.

New to Terraform? Sign up for an HCP account to get started today and check out our tutorials. HCP Terraform includes a $500 credit that allows users to quickly get started using features from any plan, including HCP Terraform Premium. Contact our sales team if you’re interested in trying our self-managed offering: Terraform Enterprise.



from HashiCorp Blog https://ift.tt/nfwIHMC
via IFTTT

The Gentlemen Ransomware Claims 478 Victims, Can Spread Like a Worm

A new analysis of The Gentlemen operation has revealed that the financially motivated threat group initially operated as an affiliate responsible for conducting double extortion attacks, while leveraging resources from various ransomware-as-a-service (RaaS) schemes like LockBit (aka Tenacious Mantis), Qilin (aka Pestilent Mantis), and Medusa (aka Venomous Mantis).

According to a detailed report published by PRODAFT, the group, which it tracks as Phantom Mantis, is led by a Russian-speaking cybercriminal tracked as LARVA-368, who goes by the monikers hastalamuerte, ArmCorp, zeta88, nobody0, and santamuerte. The Gentlemen has been active since March 2025, claiming a total of 478 victims to date, per data from Ransomware.Live.

"In July 2025, Phantom Mantis transitioned into The Gentlemen, an independent partnership program no longer dependent on other RaaS groups," the Swiss cybersecurity company said. "Additionally, LARVA-368 relies heavily on artificial intelligence for the development and maintenance of ransomware and tools, as well as for assistance with post-exploitation procedures."

As for LARVA-368, the threat actor is assessed to have been a member of the Embargo (aka Primeval Mantis) ransomware group before launching their own operation under the name ArmCorp. It was rebranded as The Gentlemen four months later.

The individual's identity has since been outed by cybersecurity journalist Brian Krebs as 36-year-old Alexander Andreevich Yapaev (Япаев Александр Андреевич) from the Russian city of Izhevsk. PRODAFT told The Hacker News that its findings match the same persona with "high confidence."

As detailed by Dark Atlas in August 2025, the shift coincided with a payment dispute between LARVA-368 and Qilin, with the threat actor accusing the RaaS operation of carrying out an exit scam and defrauding them of $48,000.

"Although Phantom Mantis was a very active affiliate group with over 20 targets registered on its affiliate panel in less than 30 days, the group's admin (LARVA-368) and LARVA-367 (aka DevMan), a former Phantom Mantis's member, claimed that Pestilent Mantis was scamming affiliates and that there was an alleged 'backdoor' within the Pestilent Mantis's affiliate panel victim chats," PRODAFT noted.

"Although we could not confirm these claims, there is a chance that LARVA-368 and LARVA-367 intentionally spread disinformation with the intent of recruiting Pestilent Mantis affiliates to Phantom Mantis by discrediting the group."

Phantom Mantis has also been observed paying for Premium accounts on underground forums to boost its visibility and fend off competition, with the group's communications and technical support handled by a separate Russian-speaking persona named The Gentlemen Data.

Some of the other salient aspects of the extortion scheme compiled from various reports are as follows -

  • In an analysis of the ransomware late last year, LevelBlue's Cybereason team described The Gentlemen as a "highly adaptive, fast-moving ransomware operation" that combines mature ransomware techniques with RaaS features, double extortion, cross-platform lockers, flexible propagation, and affiliate support.
  • The group has emerged as one of the most active threat actors, accounting for 10% of ransomware activity in April 2026. "The Gentlemen follows an enterprise-focused chain beginning with initial access, via vulnerable internet-facing services or stolen credentials," NCC Group said. "Analysis suggests The Gentlemen can adapt and change tactics during an attack, such as manipulating GPOs, compromising privileged accounts, and using custom methods to bypass endpoint protections."
  • Only about 13% of their victims are based in the U.S. The majority of the victims are concentrated in Thailand, the U.K., Brazil, Germany, and India.
  • LARVA-368 uses The Gentlemen IM app accounts to support affiliates regarding encryption and any intrusion-related issue, such as providing EDR killers to bypass security solutions via the bring your own vulnerable driver (BYOVD) technique.
  • Support services for both The Gentlemen and The Gentlemen Data are available via Tox, SimpleX Chat, and Ricochet Refresh open-source messaging platforms.
  • Potential affiliates are required to provide the administrator at least 1GB of data exfiltrated from a victim to gain access to the affiliate panel, a tactic designed to prevent researchers and law enforcement authorities from gaining access to the infrastructure under the guise of an affiliate. The affiliate panel supports user management, configuring new targets, and downloading ransomware to a specific target.
  • Phantom Mantis provides five versions of ransomware that are designed for Windows, Linux, ESXi, Windows XP+, and Logical Volume Manager (LVM).
  • The group courts affiliates with an aggressive profit-sharing model: 90% for affiliates and 10% for the operator.
  • Initial access is obtained via edge devices such as VPN appliances, firewalls, and other internet-facing systems, with a specific focus on platforms like Cisco and Fortinet FortiGate.
  • Infection chains involve the use of red team utilities like NetExec, RelayKing, TaskHound, PrivHound, and CertiHound to perform Active Directory discovery, certificate abuse, privilege escalation, and file share discovery. A separate set of tools, such as EDRStartupHinder, gfreeze, glinker, and DumpBrowserSecrets, is used for evading security programs, while Velociraptor is employed for command-and-control (C2).
  • The attacks also attempt to clear System, Application, and Security Windows Event Logs, disable Microsoft Defender, and add antivirus exclusions.
  • The ransomware makes use of a hybrid cryptographic scheme: X25519 key exchange combined with XChaCha20 symmetric encryption.
  • Microsoft, which is tracking the cluster under the moniker Storm-2697, said the ransomware is written in Go and obfuscated with Garble to target the Windows environment. "When enabled with the --spread argument, it turns the malware from a single-host encryptor into a self-propagating worm that attempts to deploy its encryptor to every reachable system on the network," the tech giant said. "If the --wipe argument is provided, The Gentlemen ransomware performs an additional post-encryption routine to eliminate recoverable artifacts from disk."
  • According to ZeroFox, the ransomware crew likely runs a multi-channel extortion operation, combining ransomware attacks with email outreach and phone-based pressure tactics targeting victims.
  • The group implements a "highly responsive development cycle," an aspect exemplified by the release of a same-day patch after a decryptor was released in April 2026.
  • The average dwell time of an intrusion ranges from two to six weeks from initial access to encryption, with the group particularly focusing on organizations running VMware infrastructure.
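
The log-clearing behavior noted in the list above leaves a trace of its own: Windows records Event ID 1102 in the Security log when the audit log is cleared, and Event ID 104 in the System log when an event log is cleared. The following is a minimal defensive sketch, assuming events have already been exported to dictionaries. The keys `host`, `channel`, and `event_id` and the sample records are hypothetical.

```python
# (channel, event ID) pairs that Windows writes when an event log is cleared.
LOG_CLEARED_EVENT_IDS = {
    ("Security", 1102),  # the Security audit log was cleared
    ("System", 104),     # an event log was cleared
}


def find_log_clearing(events):
    """Return the events that record an event log being cleared."""
    return [
        event for event in events
        if (event.get("channel"), event.get("event_id")) in LOG_CLEARED_EVENT_IDS
    ]


# Hypothetical exported events for illustration.
SAMPLE_EVENTS = [
    {"host": "srv-01", "channel": "Security", "event_id": 4624},
    {"host": "srv-01", "channel": "Security", "event_id": 1102},
    {"host": "srv-02", "channel": "System", "event_id": 104},
    {"host": "srv-02", "channel": "Application", "event_id": 104},
]

for event in find_log_clearing(SAMPLE_EVENTS):
    print(f"{event['host']}: log cleared ({event['channel']} event {event['event_id']})")
```

Because the attacker is clearing local logs, a check like this is only useful if events are forwarded to a separate collector before they can be wiped.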

Last month, a leak of an internal Rocket.Chat database used by the group, comprising 3,366 messages sent between November 2025 and late April 2026, shed further light on the group's inner workings, including its use of known security flaws in VMware Aria Operations, Fortinet, Cisco, and Microsoft software, while painting a picture of a criminal enterprise whose members have a clear division of roles and responsibilities.

"The group actively tracks and evaluates modern vulnerabilities, including CVE-2024-55591, CVE-2025-32433, and CVE-2025-33073, and combines them with technique-driven paths like backup and management-controller abuse and NTLM relay workflows, giving them a flexible exploitation pipeline," Check Point said.

That's not all. In March 2026, Hunt.io said it discovered an open directory hosted at "176.120.22[.]127:80" on the Russian bulletproof hosting provider Proton66 that exposed 126 files containing a complete ransomware operator toolkit attributed to an affiliate of The Gentlemen RaaS.
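
The "[.]" in the address above is a defanging convention that keeps the indicator from being clickable or auto-resolved. To load such an indicator into a blocklist or search query, it has to be converted back first. The following is a minimal sketch with illustrative helper names, assuming an IPv4 address followed by a port.

```python
import ipaddress


def refang(indicator):
    """Turn a defanged indicator such as '1.2.3[.]4' back into '1.2.3.4'."""
    return indicator.replace("[.]", ".")


def split_host_port(indicator):
    """Split a defanged 'IPv4:port' indicator into (host, port)."""
    host, _, port = refang(indicator).rpartition(":")
    ipaddress.IPv4Address(host)  # raises ValueError if host is not a valid IPv4 address
    return host, int(port)


print(split_host_port("176.120.22[.]127:80"))
```

The refanged value is meant for detection tooling only, not for browsing to the host.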

This included tools for reconnaissance, privilege escalation, defense evasion, credential theft, lateral movement, persistence, and pre-encryption preparation, essentially spanning all phases of the intrusion lifecycle.

"LARVA-368 is a threat actor specializing in extortion-related activities and has been active since at least 2020," PRODAFT said. "The expertise acquired through previous collaborations with various RaaS groups provided the technical foundation necessary to establish The Gentlemen RaaS."



from The Hacker News https://ift.tt/QY8rldZ
via IFTTT