refactor(stream-test): real-wheel into view, no synthetic-click fallback
Address review feedback on accuracy: humanClick used the DOM scrollIntoViewIfNeeded and fell back to Playwright's locator.click() when an element had no bounding box; neither is real input. It now brings elements into view with a real wheel scroll and throws if there is no on-screen box (no synthetic click). The header comment and README are corrected: xdotool injects synthetic X input (it is not a physical HID device), and all actions are performed as that input, while the CDP/DOM API is used only to read state.
@@ -8,13 +8,16 @@ real browsing session captured from the X display.
 until stopped. All params from `.env` (`DISCORD_SELFBOT_TOKEN`,
 `DISCORD_GUILD_ID`, `DISCORD_VOICE_CHANNEL_ID`, `VNC_RESOLUTION`,
 `VNC_FRAMERATE`, `VNC_BITRATE_KBPS`, `STREAM_HW`, `VNC_DISPLAY`).
-- `human.mjs` - human-like interaction helpers. Real mouse/keyboard via
-`xdotool` (so the cursor is visible in the stream); Playwright only locates
-elements. Every action is real input: address-bar navigation (Ctrl+L +
-typing), search typing, clicking the video / settings menu / autoplay toggle /
-play button, fullscreen via the `f` key, scrolling, and entering links. The
-CDP/DOM API is used only to read state for verification, and as a rare click
-fallback when an element has no on-screen box.
+- `human.mjs` - human-like interaction helpers. Input is injected into the X
+server with `xdotool` (synthetic X input, not a physical HID device, but the
+browser and the captured screen see genuine pointer/keyboard events with a
+visibly moving cursor); Playwright only locates elements. Every action is such
+input: address-bar navigation (Ctrl+L + typing), search typing, clicking the
+video / settings menu / autoplay toggle / play button, fullscreen via the `f`
+key, and scrolling. Elements are brought into view with a real wheel scroll
+(no DOM scrollIntoView); if an element has no on-screen box the click fails
+rather than falling back to a synthetic click. The CDP/DOM API is used only to
+read state for verification, never to act.
 - `scenario.mjs` - the browse scenario (YouTube -> IU live -> 1080p ->
 fullscreen -> Naver -> 나무위키), driven with the human helpers. Connects to a
 Chrome already running with `--remote-debugging-port` (`CDP_PORT`, default
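The README text above describes address-bar navigation as Ctrl+L followed by typing. A minimal sketch of the xdotool argument vectors such a sequence might map to is below. `addressBarSteps` and the 80 ms default delay are hypothetical and are not taken from `human.mjs`, whose typing helper is not shown in this diff; only the `key` and `type --delay` subcommands are standard xdotool.

```javascript
// Hypothetical helper (not part of human.mjs): builds the xdotool argument
// vectors for "focus the address bar, type a URL, press Enter".
// Each inner array is meant to be passed to xdotool as one invocation.
function addressBarSteps(url, delayMs = 80) {
  return [
    ['key', 'ctrl+l'],                         // focus the address bar
    ['type', '--delay', String(delayMs), url], // type with a per-keystroke delay
    ['key', 'Return'],                         // submit the URL
  ];
}

// Example: inspect the steps without touching the X server.
for (const args of addressBarSteps('https://example.com')) {
  console.log('xdotool', args.join(' '));
}
```

Keeping the steps as plain data makes the sequence easy to log or test without a running display; a real helper would presumably add randomized pauses between steps.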
@@ -1,12 +1,15 @@
-// Human-like interaction helpers: drive the REAL X mouse/keyboard via xdotool
-// so the cursor visibly moves and is captured by the screen stream, using
-// Playwright only to LOCATE elements and read state. This is the default
-// interaction mode for the browse scenarios.
+// Human-like interaction helpers. Drive input with xdotool, using Playwright
+// only to LOCATE elements and read state.
 //
-// Note: only the user-visible browsing actions are real input (cursor move,
-// click, scroll, char-by-char typing). Behind-the-scenes control (window
-// fullscreen, play, quality, autoplay toggle, page navigation, and click
-// fallbacks) intentionally uses the CDP/DOM API for reliability.
+// What xdotool actually is: it injects input events into the X server (it is
+// NOT a physical HID device). The browser and the captured screen receive them
+// as genuine pointer/keyboard input, with a visibly moving cursor. Every ACTION
+// here is such input: cursor move, click, char-by-char typing, key presses, and
+// wheel scroll - including (in scenario.mjs) navigation, quality, fullscreen and
+// the autoplay toggle. The CDP/DOM API is used only to READ state for
+// verification, never to perform an action. Elements are brought into view with
+// a real wheel scroll (not a DOM scrollIntoView); if an element has no on-screen
+// box, the click fails rather than falling back to a synthetic click.
 import { execFile } from 'node:child_process';
 
 const DISPLAY = process.env.VNC_DISPLAY || ':1';
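The header comment lists wheel scroll among the injected inputs. Under X11 the wheel is exposed as pointer buttons 4 (up) and 5 (down), so a scroll is a `click` on one of those buttons. A small sketch of building that argument vector follows; `wheelArgs` is a hypothetical name, and the use of `--repeat` and `--delay` (both standard `xdotool click` options) is an assumption about style, not something this commit does.

```javascript
// Hypothetical helper (not part of human.mjs): builds the argument vector for
// an xdotool wheel scroll. X11 maps wheel-up to button 4 and wheel-down to 5.
function wheelArgs(direction, notches = 3, delayMs = 40) {
  if (direction !== 'up' && direction !== 'down') {
    throw new Error(`wheelArgs: unknown direction ${direction}`);
  }
  const button = direction === 'up' ? '4' : '5';
  // One invocation with --repeat instead of several separate click calls.
  return ['click', '--repeat', String(notches), '--delay', String(delayMs), button];
}

// Example: three notches down, printed rather than executed.
console.log('xdotool', wheelArgs('down').join(' '));
```

A single invocation with `--repeat` spawns one process per scroll burst instead of one per notch; whether that timing looks more or less natural than separate calls is a judgment the scenario author would need to make.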
@@ -55,12 +58,27 @@ export async function humanClickXY(sx, sy) {
   await sleep(rand(130, 300));
 }
 
-// Locate a Playwright element, move the real cursor into it (random offset), click.
+// Bring an element into view using a REAL wheel scroll (not a DOM
+// scrollIntoView). Returns its viewport box, or null if it can't be revealed.
+async function bringIntoView(page, locator) {
+  const ih = await page.evaluate(() => window.innerHeight);
+  for (let i = 0; i < 14; i++) {
+    const box = await locator.boundingBox().catch(() => null);
+    if (box && box.y >= 70 && box.y + box.height <= ih - 70) return box;
+    const button = box ? (box.y < 70 ? '4' : '5') : '5'; // 4=up, 5=down
+    await xdo(['click', button]); await xdo(['click', button]); await xdo(['click', button]);
+    await sleep(rand(120, 240));
+  }
+  return await locator.boundingBox().catch(() => null);
+}
+
+// Locate a Playwright element, real-wheel it into view, move the real cursor
+// into it (random offset), and click. No synthetic-click fallback: if the
+// element has no on-screen box, this throws.
 export async function humanClick(page, locator) {
-  await locator.scrollIntoViewIfNeeded().catch(() => {});
   await sleep(rand(150, 380));
-  const box = await locator.boundingBox();
-  if (!box) { await locator.click({ timeout: 5000 }).catch(() => {}); return; }
+  const box = await bringIntoView(page, locator);
+  if (!box) throw new Error('humanClick: element has no on-screen box; refusing synthetic click');
   const { ox, oy } = await contentOrigin(page);
   const sx = Math.round(ox + box.x + box.width * rand(0.35, 0.65));
   const sy = Math.round(oy + box.y + box.height * rand(0.35, 0.65));
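The in-view test inside `bringIntoView` can be pulled out as pure functions, which makes the 70 px band easy to check in isolation. The sketch below is illustrative only: `isInView`, `wheelButton`, and `boxOrNull` are hypothetical names, and `boxOrNull` reflects one possible reading of the "or null if it can't be revealed" comment. As written in the diff, the final `return` after the loop appears to hand back whatever box exists after 14 attempts, even if it still lies outside the band.

```javascript
const MARGIN = 70; // same pixel margin the diff uses at the top and bottom

// True when the box sits fully inside the viewport, clear of both margins.
function isInView(box, viewportHeight, margin = MARGIN) {
  return Boolean(box) &&
    box.y >= margin &&
    box.y + box.height <= viewportHeight - margin;
}

// Wheel button to press next: '4' (up) if the box is above the band,
// otherwise '5' (down), including when there is no box yet.
function wheelButton(box, margin = MARGIN) {
  return box && box.y < margin ? '4' : '5';
}

// Stricter final result (an assumption, not the commit's behaviour):
// return the box only if it really ended up in view, else null.
function boxOrNull(box, viewportHeight, margin = MARGIN) {
  return isInView(box, viewportHeight, margin) ? box : null;
}

console.log(isInView({ x: 0, y: 100, width: 10, height: 50 }, 800)); // true
```

One consequence of the band test is that an element taller than the viewport height minus both margins can never pass it, so the loop would run all 14 iterations for such an element.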